Enzymes that cleave non-glycosidic ether bonds between lignins or derivatives thereof and saccharides

ABSTRACT

The patent application relates to isolated polypeptides that specifically cleave non-glycosidic ether bonds between lignins or derivatives thereof and saccharides, and to cDNAs encoding the polypeptides. The patent application also relates to nucleic acid constructs, expression vectors and host cells comprising the cDNAs, as well as methods of producing and using the isolated polypeptides for treating pulp and biomass to increase soluble saccharide yield and enrich lignin fractions.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims the benefit of priority of U.S. Provisional Patent Application Ser. No. 62/050,594, filed Sep. 15, 2014, and U.S. Provisional Patent Application Ser. No. 62/016,329, filed Jun. 24, 2014; which are hereby incorporated by reference in their entirety.

GOVERNMENT SUPPORT

This invention was made with government support under National Science Foundation Grants 1046844 and 1315023, Department of Energy Grant DE-FG02-07ER84788, Maine Technology Institute Grants SG1537, SG1793, SG3446, DA708 and DA1613 and two Department of Transportation Sun Grant Initiative Awards. The government has certain rights in the invention.

REFERENCE TO A SEQUENCE LISTING

This application contains a Sequence Listing, named “39858.0202_ST25.txt”, having a size of 33,302 bytes and created Jun. 23, 2015, which is incorporated herein by reference.

FIELD OF THE INVENTION

The patent application relates to isolated polypeptides that specifically cleave a non-glycosidic ether bond between a lignin or a derivative thereof and a saccharide, and cDNAs encoding the polypeptides. The patent application also relates to nucleic acid constructs, expression vectors and host cells comprising the cDNAs as well as methods of producing and using the isolated polypeptides for treating pulp and biomass to increase soluble saccharide content and enrich lignin content.

BACKGROUND OF THE INVENTION

Today, the United States imports vast amounts of petroleum to help satisfy its energy requirements. Volatile pricing, supply limitations, greenhouse gas emissions and the political and military costs associated with fossil fuels have all led to renewed interest in energy alternatives. But oil is not used only for fuel; many important basic chemical commodities are also produced from it. In fact, approximately 5% of the total output of a petroleum refinery is used by the chemical processing industry as raw materials (See Ragauskas, A. J. et al., The path forward for biofuels and biomaterials, Science 311:484-489 (2006), which is hereby incorporated by reference in its entirety), and as the cost of oil rises, so too does the cost of downstream commodity chemicals like plastic resins.

Fortunately, biomass can be a substitute feedstock for the production of fuel as well as many of the building block chemicals that are currently produced from oil. Because petroleum and biomass are both carbon-based, chemicals (including relatively linear polymers and polymer building blocks) that are based on non-renewable petroleum products can also be produced from renewable biomass using current fermentation techniques.

Unfortunately, the use of agricultural crops as a source of biomass for chemicals and energy suffers from at least two disadvantages. First, raising crops is oil intensive. Farm equipment needs fuel to weed, till and harvest. Moreover, fertilizers and pesticides are often produced from petroleum. Second, crops and land are needed to feed both humans and domestic animals, and their use to produce an industrial feedstock escalates competition between the use of land for food or for fuel. The United Nations Food & Agricultural Organization (FAO) reports that the average price of corn increased 85% between the years of 2000 and 2007 as a direct result of rising farm energy costs and increased demand from ethanol and bioplastics producers. In fact, in 2007 about 25% of the US corn crop was diverted into ethanol production and the rate is accelerating. See ICIS Chemical Business, Biofuels backlash grows in fuel versus food debate, Simon Robinson, London, Feb. 11, 2008, which is hereby incorporated by reference in its entirety.

A better alternative to the use of agricultural biomass is selective tree cutting, which is sustainable and requires little cultivation. Due to vertical tree growth, it produces a much greater yield of biomass per acre. In a recent report, the Pacific Northwest National Laboratory (PNNL) and the National Renewable Energy Laboratory (NREL) summarized the results of an extensive screening study of the possibilities for processing the sugars derived from woody biomass into basic chemicals. See Werpy, T. and Peterson, G., Top value added chemicals from biomass, Volume I, Results of screening for potential candidates from sugars and synthesis gas, produced for the NREL, Publication No. DOE/GO-102004-1992, August 2004, which is hereby incorporated by reference in its entirety. Among the 300 possible products listed were the top 30 building block chemicals of industry. Of particular interest are itaconic acid and lactic acid, the bifunctional organic acids derived from fermentation that can be made into a wide variety of plastic products.

North American forests contain huge amounts of woody biomass and the cost per ton of raw material is significantly less than for agricultural biomass. However, the drawback to using wood for chemical production and biofuels has been, and continues to be, the difficulty and inefficiency of fractionating wood into its three basic components, namely, cellulose, hemicellulose and lignin. Effective separation would allow the hemicellulose (a branched and relatively short chain of simple sugars) to be utilized for fermentation into chemical products, instead of being burned as waste. It is estimated that 60-80% of the cost of manufacturing chemical products from agricultural biomass is incurred in separating fermentable sugars from the starting material. See Ragauskas, A. J. et al., 2006, supra. For forest biomass, with complex linkages between its three components, this percentage is likely larger. Therefore, decreasing the cost and increasing the efficiency of separation will have a substantial effect on the economic feasibility of using forest biomass for chemical and biofuel production.

The first step toward cost-effective use of forest biomass has been the conceptualization of the integrated forest biorefinery (IFBR) co-located with pulp and paper mills. In such a system, value is maximized by diverting high-value cellulose to papermaking, in effect subsidizing the separation cost. The lignin and hemicellulose can then be made available for further processing instead of being burned for energy, as is currently the case. At present, IFBRs are targeting hardwood as a raw material. This is because softwoods are more extensively cross-linked, making it harder to extract their hemicellulose. However, the predominant softwood hemicellulose (mannan) is made of more easily fermentable sugars. A cost-effective method of extracting mannan and xylan from softwoods and hardwoods would yield superior hemicellulose feedstreams.

Therefore, there is a need to develop innovative, efficient, cost-effective and non-damaging procedures for fractionating woody biomass (e.g. hardwoods and softwoods) for chemical and fuel production. There is also a need to develop methods for separating woody biomass into hemicellulose, cellulose and lignin components that are clean and gentle and at conditions that maintain the functionality and downstream use of each of these components. This invention answers those needs.

SUMMARY OF THE INVENTION

This invention relates to complementary DNA (cDNA) molecules that encode isolated polypeptides that specifically cleave non-glycosidic ether bonds between lignins or derivatives thereof and saccharides. The cleavage of non-glycosidic ether bonds can be between aromatic or non-aromatic carbons of the lignins or derivatives thereof and the saccharides.

Examples of the saccharides may include monosaccharides, disaccharides, oligosaccharides, and polysaccharides. Hemicellulose is an example of a polysaccharide.

The isolated polypeptides may include the amino acid sequence of SEQ ID NO:2 and SEQ ID NO:4, which correspond to the sequences derived from the genomic clone and non-genomic clone (see catalytic fragment as shown in FIG. 5), respectively. The isolated polypeptides have at least about 80% or at least about 90-95% sequence identity to the amino acid sequence of SEQ ID NO:2 or SEQ ID NO:4. Alternatively, the amino acid sequences of the isolated polypeptides can have at least about 96%, at least about 97%, at least about 98% or at least about 99% sequence identity to the amino acid sequence of SEQ ID NO:2 or SEQ ID NO:4.

The isolated polypeptides may also include the amino acid sequence of SEQ ID NO:50, which corresponds to the sequence derived from the genomic clone of XLE. The isolated polypeptides have at least about 80%, at least about 85%, or at least about 90-95% sequence identity to the amino acid sequence of SEQ ID NO:50. Alternatively, the amino acid sequences of the isolated polypeptides can have at least about 96%, at least about 97%, at least about 98% or at least about 99% sequence identity to the amino acid sequence of SEQ ID NO:50.

The isolated polypeptides can specifically cleave a non-glycosidic ether bond between a lignin or a derivative thereof and a saccharide. The saccharide can be a monosaccharide, a disaccharide, an oligosaccharide or a polysaccharide. The polysaccharide can be a hemicellulose. The isolated polypeptides are encoded by their respective cDNAs, namely SEQ ID NOS. 1 and 3, respectively. The isolated polypeptides may include (a) a polypeptide having at least about 80% sequence identity to the mature polypeptide of SEQ ID NO:2; (b) a polypeptide having at least about 90-95% sequence identity to the mature polypeptide of SEQ ID NO:2; (c) a polypeptide encoded by a polynucleotide that hybridizes under medium to high stringency conditions with (i) the mature polypeptide coding sequence of SEQ ID NO:1 or (ii) the full length complement of (i); (d) a polypeptide encoded by a polynucleotide having at least about 80% sequence identity to the mature polypeptide coding sequence of SEQ ID NO:1; (e) a polypeptide encoded by a polynucleotide having at least about 90-95% sequence identity to the mature polypeptide coding sequence of SEQ ID NO:1; (f) a variant of the mature polypeptide of SEQ ID NO:2 comprising a substitution, deletion and/or insertion at one or several positions; and (g) a fragment of the polypeptide of (a), (b), (c), (d) or (e) that specifically cleaves a non-glycosidic ether bond between a lignin or a derivative thereof and a saccharide.

Alternatively, the isolated polypeptide is encoded by its cDNA, e.g., SEQ ID NO:49. The isolated polypeptide may include: (a) a polypeptide having at least about 80%, or at least about 85% sequence identity to the mature polypeptide of SEQ ID NO:50; (b) a polypeptide having at least about 90-95% sequence identity to the mature polypeptide of SEQ ID NO:50; (c) a polypeptide encoded by a polynucleotide that hybridizes under medium to high stringency conditions with (i) the mature polypeptide coding sequence of SEQ ID NO:49 or (ii) the full length complement of (i); (d) a polypeptide encoded by a polynucleotide having at least about 80%, or at least about 85% sequence identity to the mature polypeptide coding sequence of SEQ ID NO:49; (e) a polypeptide encoded by a polynucleotide having at least about 90-95% sequence identity to the mature polypeptide coding sequence of SEQ ID NO:49; (f) a variant of the mature polypeptide of SEQ ID NO:50 comprising a substitution, deletion and/or insertion at one or several positions; and (g) a fragment of the polypeptide of (a), (b), (c), (d) or (e) that specifically cleaves a non-glycosidic ether bond between a lignin or a derivative thereof and a saccharide.

The isolated polypeptide may also include (a) a catalytic domain having at least about 80% sequence identity to the amino acids of SEQ ID NO:4; (b) a catalytic domain having at least about 90-95% sequence identity to the amino acids of SEQ ID NO:4; (c) a catalytic domain encoded by a polynucleotide that hybridizes under medium to high stringency conditions with (i) the nucleotide sequence of SEQ ID NO:3 or (ii) the full length complement of (i); (d) a catalytic domain encoded by a polynucleotide having at least about 80% sequence identity to the nucleotide sequence of SEQ ID NO:3; and (e) a catalytic domain encoded by a polynucleotide having at least about 90-95% sequence identity to the nucleotide sequence of SEQ ID NO:3.
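As a rough illustration of how the sequence-identity thresholds recited above might be checked, the following minimal sketch computes percent identity for two sequences that have already been aligned, with gaps written as "-". The function names, the use of alignment length as the denominator, and the toy peptides are assumptions made for illustration only; they are not taken from this specification, and other identity conventions exist.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over a pairwise alignment (gaps written as '-').

    Assumption: identity = identical aligned residues / alignment length,
    ignoring any column that is a gap in both sequences.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment has no residue columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 80.0) -> bool:
    """Return True if the aligned pair reaches the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical toy peptides, not sequences from the Sequence Listing.
reference = "MKTAYIAKQR"
candidate = "MKTAYLAKQR"  # one substitution in ten aligned positions
print(percent_identity(reference, candidate))
print(meets_threshold(reference, candidate, 80.0))
```

In practice the alignment itself would come from a standard pairwise alignment tool; the sketch only covers the counting step that follows alignment.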

The isolated polypeptides may be a mannan:lignin etherase or xylan:lignin etherase. The isolated polypeptide may cleave (a) the non-glycosidic ether bond between an aromatic carbon of the lignin or the derivative thereof and the saccharide or (b) the non-glycosidic ether bond between a non-aromatic carbon of the lignin or the derivative thereof and the polysaccharide. Examples of non-aromatic carbons of the lignin may include α-linked benzyl carbon or β-linked benzyl carbon.

Also disclosed herein is a method of treating a pulp or biomass containing cross-linked lignin-saccharide complexes, which comprises contacting the pulp or biomass with the isolated polypeptide for a sufficient amount of time to allow the polypeptide to break at least some of the non-glycosidic ether bonds between lignin-saccharide complexes, thereby causing the lignins and saccharides to be released from the lignin-saccharide complexes in the pulp or biomass without significant concomitant degradation of the isolated lignins and saccharides. The method for pulp or biomass treatment may further comprise co-incubating concurrently or sequentially the pulp or biomass with a hemicellulase such that intact hemicellulose is not removed from the pulp or biomass. Examples of saccharides as used herein are monosaccharides, disaccharides, oligosaccharides and polysaccharides. An example of a polysaccharide is hemicellulose.

The method for pulp or biomass treatment involves cleavage of the non-glycosidic ether bond between an aromatic or non-aromatic carbon of the lignin or the derivative thereof and the saccharide. Examples of non-aromatic carbons of the lignin are α-linked benzyl carbon and β-linked benzyl carbon.

Another method relates to the identification of an enzyme that specifically cleaves a non-glycosidic ether bond between a lignin and a saccharide. The method encompasses (a) providing a fluorogenic lignin analog that is capable of forming a non-glycosidic ether bond with the saccharide; (b) derivatizing the fluorescent lignin analog onto the saccharide via the non-glycosidic ether bond, wherein the formation of the non-glycosidic ether bond changes the fluorescent property of the lignin analog; and (c) contacting an enzyme with the lignin analog-derivatized saccharide, wherein a change in the fluorescent property of the lignin analog after contacting indicates that the enzyme specifically cleaves the non-glycosidic ether bond between the lignin-analog and the saccharide. An example of a fluorogenic lignin analog is 4-methylumbelliferyl acetate (4-MU). The saccharides can be monosaccharides, disaccharides, oligosaccharides and polysaccharides. An example of a polysaccharide is hemicellulose.
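The read-out in step (c) of the identification method can be pictured as a simple comparison of fluorescence with and without the candidate enzyme. The sketch below is illustrative only: the function names, the relative-change threshold of 0.5, and the sample readings are hypothetical and are not values disclosed in this specification.

```python
from typing import Dict, List


def fluorescence_changed(control: float, treated: float,
                         min_relative_change: float = 0.5) -> bool:
    """Flag a change in fluorescence relative to a no-enzyme control.

    Assumption: a candidate is flagged when the signal differs from the
    control by at least `min_relative_change` (a hypothetical cutoff).
    """
    if control <= 0:
        raise ValueError("control fluorescence must be positive")
    return abs(treated - control) / control >= min_relative_change


def screen_candidates(control: float, readings: Dict[str, float]) -> List[str]:
    """Return the names of candidates whose reading differs from the control."""
    return [name for name, value in readings.items()
            if fluorescence_changed(control, value)]


# Hypothetical arbitrary-unit readings for the derivatized saccharide substrate.
readings = {"isolate_A": 410.0, "isolate_B": 104.0, "isolate_C": 180.0}
print(screen_candidates(control=100.0, readings=readings))
```

A flagged candidate in such a screen would only be a starting point; specificity for the non-glycosidic ether bond would still need to be confirmed experimentally.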

Also described herein are nucleic acid constructs or expression vectors that include the cDNA molecules encoding the isolated polypeptides of the application, wherein the cDNA molecules are operably linked to one or more control sequences that direct the expression of the polypeptides in the expression hosts. Examples of the nucleic acid constructs or expression vectors are selected from the group consisting of pHIS525-cMLE, pHIS525-cfMLE, pAES40-cMLE, pAES40-cfMLE, pHT43-cMLE, pHT43-cfMLE, pBluescript SK⁻-cMLE, pBluescript SK⁻-cfMLE, pFN6A-cMLE and pFN6A-cfMLE.

Transformed host cells can include the expression vectors that comprise the cDNA molecules of the application. Examples of the transformed host cells described in the application are B. megaterium (pHIS525-cMLE), B. subtilis (pHIS525-cMLE), B. megaterium (pHIS525-cfMLE), B. subtilis (pHIS525-cfMLE), E. coli (pAES40-cMLE), E. coli (pAES40-cfMLE), B. subtilis (pHT43-cMLE), B. subtilis (pHT43-cfMLE), E. coli (pBluescript cMLE), E. coli (pBluescript SK⁻-cfMLE), E. coli (pFN6A-cMLE) and E. coli (pFN6A-cfMLE).

Another feature of the invention is a method of producing heterologous polypeptides that specifically cleave non-glycosidic ether bonds between lignins or derivatives thereof and saccharides. The method involves (a) cultivating the transformed host cells containing the expression vectors that comprise the cDNA molecules under conditions conducive for the production of the heterologous polypeptides; and (b) recovering the heterologous polypeptides.

Additional aspects, advantages and features of the invention are set forth in this specification, and in part will become apparent to those skilled in the art on examination of the following, or may be learned by practice of the invention. The inventions disclosed in this application are not limited to any particular set of or combination of aspects, advantages and features. It is contemplated that various combinations of the stated aspects, advantages and features make up the inventions disclosed in this application.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows a gel permeation high pressure liquid chromatography (GP-HPLC) of 4-methylumbelliferyl-locust bean gum (4-MU-LBG) and locust bean gum (LBG).

FIG. 2 shows a decision tree for determining if the enzyme activity is soluble, tethered, energy cofactor-dependent or energy cofactor-independent.

FIG. 3 shows the synthesis of benzylated locust bean gum (BLBG).

FIG. 4 shows the scanning electron micrograph of strain B603.

FIG. 5 shows the nucleotide and translated amino acid sequences (SEQ ID NOS:3 and 4, respectively) of mannan:lignin etherase (MLE)-ORF (open reading frame).

FIG. 6 shows the alignment of cDNA open reading frame of mannan:lignin etherase (SEQ ID NO:3) to a glycogen debranching enzyme from Burkholderia glumae BGR1 (SEQ ID NO:5).

FIGS. 7A-7B show a Southern analysis of putative etherase (MLE) cDNA. FIG. 7A shows the ethidium bromide stained gel before transfer. FIG. 7B shows the Southern blot probed with a biotinylated probe to cDNA from clone 17-2.

FIGS. 8A-8C show a predicted gene sequence for the mannan:lignin etherase (MLE) gene (FIG. 8B; SEQ ID NO:1) and its surrounding regions (an upstream sequence (FIG. 8A; SEQ ID NO:6) and a downstream sequence (FIG. 8C; SEQ ID NO:7)) identified using gene prediction software (FGENES) from Softberry, Inc.

FIG. 9 shows the deduced amino acid sequence for mannan:lignin etherase (MLE; SEQ ID NO:2).

FIG. 10 shows a concentrated culture supernatant before and after incubation with softwood kraft pulp.

FIG. 11 shows the zymography of E518 culture supernatant. The culture supernatant was from E518 grown for 29 hours in medium containing oligoxylan and benzylated xylan. The culture supernatants were diafiltered and concentrated using a Sartorius 2 kD cutoff spin column, lyophilized, and redissolved in native gel sample buffer at a final concentration of about 150 fold. Duplicate samples were loaded onto wells of a 10% native gel. One half of the gel was stained with Coomassie Blue G250 and the other half was soaked in HBSS without ammonium nitrate for 10 minutes to exchange the buffer and then blotted dry with filter paper and overlaid with 2 ml of 4MU-xylan in 0.175% agarose. The gel-overlay sandwich was incubated for 8 to 10 minutes, and removed. A piece of PVDF membrane (wetted and equilibrated with HBSS) was applied to the non-overlay side of the gel for 5 minutes. The PVDF membrane was incubated with 0.1 M sodium borate, pH 9.9, for 1 minute, illuminated with short-wave UV light, and photographed. Lane a shows the Coomassie stain, and Lane b is the zymograph of the PVDF membrane corresponding to the duplicate lane that was zymographed.

FIG. 12 shows the DNA sequence (SEQ ID NO:49) of xylan:lignin etherase (XLE).

FIG. 13 shows the translated amino acid sequence (SEQ ID NO:50) of xylan:lignin etherase (XLE).

DETAILED DESCRIPTION OF THE INVENTION

The fractionation of wood biomass into hemicellulose, cellulose and lignin, as described hereinbelow, improves pulp yield and facilitates the separation and release of sugars from their raw material sources. These sugars can then be available for fermentation into biofuels and other basic bioproducts.

Ether bonds between lignin and hemicellulose are a primary reason for the strength of both hardwoods and softwoods and for the difficulty of fractionating both types of wood into their component macromolecules. The polypeptides or enzymes, as described herein, can be used in the early stages of the pulping process to increase the separation of lignin, hemicellulose and cellulose without the concomitant degradation that occurs with current technology. Because the polypeptides or enzymes described herein will not depolymerize any of the polysaccharides, such enzyme pretreatment will lead to increased cellulose yield for the papermaking industry while creating production of separate streams of hemicellulose and lignin for further processing. The enzyme treatment can also be used further downstream (e.g., to brighten paper and decrease the need for chemical bleaching) reducing associated chemical and environmental costs for the pulp and paper industry. Besides improving the quantity and quality of cellulose for paper production, the ability to separate biomass into three distinct feedstocks is advantageous in various other ways.

For example, uses for hemicellulose include fermentation into building blocks for polymers, fine chemicals and chiral chemicals, or into biofuels. Alternatively, the hemicelluloses can be used in animal feed. Most hemicellulose in pulp mills is currently extracted with the black liquor and burned. As a source of heat, hemicelluloses are worth only about $50 per oven-dry metric ton. See van Heiningen, A., Converting a kraft pulp mill into an integrated forest biorefinery, Pulp and Paper Canada, 107:38-43 (2006), which is hereby incorporated by reference in its entirety. If they could be efficiently extracted from wood components and used in a biorefinery as feedstock for the production of ethanol and acetic acid, the downstream value of hemicellulose would approach $1,000 per ton. It is estimated that the $5.5 billion U.S. pulp industry could generate an additional $3.3 billion annually if as few as 100 mills were routinely extracting high-grade hemicelluloses. See van Heiningen, A., 2006, supra.
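The economic figures quoted above can be put in proportion with a short calculation. The sketch below uses only the numbers stated in the preceding paragraph; the per-mill average assumes the additional revenue is spread evenly over exactly 100 mills, which is a simplification introduced here and not a statement from the cited reference.

```python
import math

# Figures quoted in the preceding paragraph (van Heiningen, 2006).
HEAT_VALUE_PER_TON = 50.0          # USD per oven-dry metric ton when burned
FEEDSTOCK_VALUE_PER_TON = 1000.0   # USD per ton, approximate downstream value
INDUSTRY_SIZE = 5.5e9              # USD, U.S. pulp industry
ADDITIONAL_REVENUE = 3.3e9         # USD per year, estimated
MILLS = 100                        # assumed to share the revenue evenly

# How many times more valuable hemicellulose is as feedstock than as fuel.
value_ratio = FEEDSTOCK_VALUE_PER_TON / HEAT_VALUE_PER_TON

# Average additional annual revenue per mill under the even-split assumption.
revenue_per_mill = ADDITIONAL_REVENUE / MILLS

# Additional revenue expressed as a fraction of the current industry size.
relative_increase = ADDITIONAL_REVENUE / INDUSTRY_SIZE

print(f"value ratio: {value_ratio:.0f}x")
print(f"average per mill: ${revenue_per_mill:,.0f} per year")
print(f"relative increase: {relative_increase:.0%}")
```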

The remaining lignin can still be burned for heat and energy or sold for synthesis of aromatic fine chemicals. Until now, uses for lignin in fine chemical synthesis have been limited. However, the basic coumaryl substructure of lignin does lend itself to certain classes of chemical syntheses, such as the manufacture of aromatic organic solvents like benzene and phenol. Polymers based on monolignols are being developed, since they can have unique and potentially useful properties due to their hydrophobicity.

A final benefit is that lignins and hemicelluloses are also present in cellulosic agricultural biomass (like corn stover or wheat grass). While the structure of non-woody biomass is simpler, separation still accounts for 60-80% of the production cost for a typical cellulosic fermentation product. See Ragauskas, A. J. et al., 2006, supra. Application of an enzymatic separation method also has the potential to significantly decrease costs for agricultural biomass-derived products.

Definitions

All publications and patent applications mentioned in this specification are herein incorporated by reference in their entirety to the same extent as if each individual publication or patent application was specifically and individually indicated to be incorporated by reference.

Before the invention is disclosed and described in detail, it is to be understood that this invention is not limited to particular compounds, configurations, method steps, substrates, and materials disclosed herein, as such compounds, configurations, method steps, substrates, and materials may vary somewhat. It is also to be understood that the terminology employed herein is used for the purpose of describing particular embodiments only and is not intended to be limiting since the scope of the present invention is limited only by the appended claims and equivalents thereof.

If nothing else is defined, any terms and scientific terminology used herein are intended to have the meanings commonly understood by those of skill in the art to which this invention pertains.

The term “about” as used in connection with a numerical value throughout the description and the claims denotes an interval of accuracy, familiar and acceptable to a person skilled in the art. Said interval is ±10%.
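One way to read this definition is as a symmetric window around the stated value. The minimal sketch below assumes that ±10% is taken relative to the stated value (so that "about 80" spans 72 to 88) rather than as ten percentage points; that reading, and the function names, are assumptions made for illustration.

```python
from typing import Tuple


def about_interval(value: float, tolerance: float = 0.10) -> Tuple[float, float]:
    """Return the (low, high) bounds implied by "about <value>".

    Assumption: the tolerance is applied relative to the stated value.
    """
    delta = abs(value) * tolerance
    return (value - delta, value + delta)


def is_about(measured: float, stated: float, tolerance: float = 0.10) -> bool:
    """True if `measured` falls within the "about" interval of `stated`."""
    low, high = about_interval(stated, tolerance)
    return low <= measured <= high


print(about_interval(80.0))   # bounds for "about 80"
print(is_about(75.0, 80.0))   # inside the window
print(is_about(70.0, 80.0))   # outside the window
```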

As used herein and in the appended claims, the singular “a,” “an” and “the” include the plural reference unless the context clearly dictates otherwise. Thus, for example, reference to a “host cell” includes a plurality of such host cells.

Unless otherwise indicated, nucleic acids are written left to right in 5′ to 3′ orientation; amino acid sequences are written left to right in amino to carboxy orientation, respectively.
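Because sequences are written 5′ to 3′, the complement of a strand, when itself written 5′ to 3′, is the reverse complement of the original string. The sketch below illustrates that convention for plain A/C/G/T DNA; the function name and the toy sequence are hypothetical and are not drawn from the Sequence Listing.

```python
# Translation table mapping each base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq_5_to_3: str) -> str:
    """Return the complementary strand, also written 5' to 3'.

    Assumption: the input contains only unambiguous A/C/G/T bases.
    """
    # Complement each base, then reverse so the result reads 5' to 3'.
    return seq_5_to_3.translate(_COMPLEMENT)[::-1]


toy = "ATGGCC"  # hypothetical fragment written 5' to 3'
print(reverse_complement(toy))
print(reverse_complement(reverse_complement(toy)) == toy)
```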

As used herein, “hemicellulose:lignin etherases” or “HLEs” are a variety of enzymes that specifically cleave a non-glycosidic ether bond between a lignin or a derivative thereof and a saccharide. The cleavage of the non-glycosidic ether bond by an HLE may be between an aromatic or a non-aromatic carbon of the lignin or the derivative thereof and the saccharide. HLEs can specifically and gently loosen the lignin away from hemicellulose without significant concomitant degradation. Examples of HLEs may include but are not limited to mannan:lignin etherase (“MLE”) and xylan:lignin etherase (“XLE”). HLEs differ from traditional pulping enzymes in that these enzymes break the non-glycosidic ether bonds between lignin and hemicellulose, which are sites of lignin-hemicellulose crosslinks. The end result of HLE action is an intact hemicellulose that can be used either intact or depolymerized by hemicellulases into sugars. Bonds between sugars and lignin are broken, increasing sugar yields and giving purer lignin fractions.

Traditional pulping enzymes (e.g., cellulases and hemicellulases) are glycosidases that only break the glycosidic bonds between the sugars in the hemicellulose and do not break lignin-hemicellulose bonds. In doing this, the hemicellulose structure is destroyed leaving some sugars still attached to the lignin where they are wasted. Under this scenario, the pulp or biomass can be incubated with HLE and a hemicellulase and/or a cellulase, either concurrently or sequentially, to break many of the non-glycosidic bonds between lignin and hemicellulose and most of the glycosidic bonds between sugars.

As used herein, a “mannan:lignin etherase” or “MLE” is a polypeptide that specifically targets mannan, the major hemicellulose of softwoods. An MLE polypeptide is a type of HLE that specifically breaks or cleaves non-glycosidic ether bonds between lignin and mannan. The model substrate is a mannan that has been derivatized with a lignin monomer analog (e.g., 4-methylumbelliferone or 4-MU) at some of the C6 residues of mannan.

Microorganisms that can Cleave Phenyl Ether Bonds Between 4-Methylumbelliferone and C₆ of Residues of Mannan Will Generate Fluorescence

As used herein, a “xylan:lignin etherase” or “XLE” is a polypeptide that specifically cleaves or breaks the phenyl ether bonds between lignin and xylan. The model substrate is xylan that has been derivatized with a lignin monomer analog, 4-methylumbelliferone (4-MU) at some of the 2′ and 3′ hydroxyl groups.

As discussed above, hemicellulose is linked to lignin by ether bonds and MLE and XLE are enzymatically targeted to the non-glycosidic ether bonds at the aromatic carbon of lignins. However, hemicellulose can also be linked to lignin via non-glycosidic ether bonds at its non-aromatic carbon bonds, e.g., the α- and β-benzyl carbon bonds of lignin. In another embodiment of the invention, hemicellulose:lignin etherases (HLEs) may also include enzymes that break the α- and β-benzyl ether bonds between mannose and lignin. Examples of mannose bonded to a lignin monomer via α- and β-benzyl bonds are provided hereinbelow:

Examples of softwood include Araucaria (e.g. A. cunninghamii, A. angustifolia, A. araucana); softwood Cedar (e.g. Juniperus virginiana, Thuja plicata, Thuja occidentalis, Chamaecyparis thyoides, Callitropsis nootkatensis); Cypress (e.g. Chamaecyparis, Cupressus Taxodium, Cupressus arizonica, Taxodium distichum, Chamaecyparis obtusa, Chamaecyparis lawsoniana, Cupressus semperviren); European Yew; Fir (e.g. Abies balsamea, Abies alba, Abies procera, Abies amabilis); Hemlock (e.g. Tsuga canadensis, Tsuga mertensiana, Tsuga heterophylla, Tsuga heterotallica), Douglas fir (Pseudotsuga menzisii), Kauri; Kaya; Larch (e.g. Larix decidua, Larix kaempferi, Larix laricina, Larix occidentalis); Pine (e.g. Pinus nigra, Pinus banksiana, Pinus contorta, Pinus radiata, Pinus ponderosa, Pinus resinosa, Pinus sylvestris, Pinus strobus, Pinus monticola, Pinus lambertiana, Pinus taeda, Pinus palustris, Pinus rigida, Pinus echinate, Pinus halepensis, Pinus elliotti, Pinus caribiae); Redwood (Sequoia sempervirens); Rimu; Spruce (e.g. Picea abies, Picea mariana, Picea rubens, Picea sitchensis, Picea glauca); Sugi; and combinations/hybrids thereof.

Examples of hardwood include Acacia (e.g. Acacia melanoxylon, Acacia homalophylla, Acacia magnium); Afzelia; Synsepalum duloificum; Albizia; Alder (e.g. Alnus glutinosa, Alnus rubra); Applewood; Arbutus; Ash (e.g. F. nigra, F. quadrangulata, F. excelsior, F. pennsylvanica lanceolata, F. latifolia, F. profunda, F. americana); Aspen (e.g. P. grandidentata, P. tremula, P. tremuloides); Australian Red Cedar (Toona ciliata); Ayna (Distemonanthus benthamianus); Balsa (Ochroma pyramidale); Basswood (e.g. T. americana, T. heterophyllal); Beech (e.g. F. sylvatica, F. grandifolia); Birch (e.g. Betula populifolia, B. nigra, B. papyrifera, B. lenta, B. alleghaniensis/B. lutea, B. pendula, B. pubescens); Blackbean; Blackwood; Bocote; Boxelder; Boxwood; Brazilwood; Bubinga; Buckeye (e.g. Aesculus hippocastanum, Aesculus glabra, Aesculus flava/Aesculus octandra); Butternut; Catalpa; Cherry (e.g. Prunus serotina, Prunus pennsylvanica, Prunus avium); Crabwood; Chestnut; Coachwood; Cocobolo; Corkwood; Cottonwood (e.g. Populus balsamifera, Populus deltoides, Populus sargentii, Populus heterophylla); Cucumbertree; Dogwood (e.g. Cornus florida, Cornus nuttallii); Ebony (e.g. Diospyros kurzii, Diospyros melanida, Diospyros crassiflora); Elm (e.g. Ulmus americana, Ulmus procera, Ulmus thomasii, Ulmus rubra, Ulmus glabra); Eucalyptus (e.g. Eucalyptus grandis, Eucalyptus urograndis, and Eucalyptus globulus); Greenheart; Grenadilla; Gum (e.g. Nyssa sylvatica, Eucalyptus globulus, Liquidambar styraciflua, Nyssa aquatica); Hickory (e.g. Carya alba, Carya glabra, Carya ovata, Carya laciniosa); Hornbeam; Hophornbeam; Ip; Iroko; Ironwood (e.g. Bangkirai, Carpinus caroliniana, Casuarina equisetifolia, Choricbangarpia subargentea, Copaifera spp., Eusideroxylon zwageri, Guajacum officinale, Guajacum sanctum, Hopea odorata, Ipe, Krugiodendron ferreum, Lyonothamnus lyonii (L. floribundus), Mesua ferrea, Olea spp., Olneya tesota, Ostrya virginiana, Parrotia persica, Tabebuia serratifolia); Jacaranda (Jacaranda acutifolia); Jotoba; Lacewood; Laurel; Limba; Lignum vitae; Locust (e.g. Robinia pseudacacia, Gleditsia triacanthos); Mahogany; Maple (e.g. Acer saccharum, Acer nigrum, Acer negundo, Acer rubrum, Acer saccharinum, Acer pseudoplatanus, Acer campestre, Acer platanoides); Meranti; Mpingo; Oak (e.g. Quercus macrocarpa, Quercus alba, Quercus stellata, Quercus bicolor, Quercus virginiana, Quercus michauxii, Quercus prinus, Quercus muhlenbergii, Quercus chrysolepis, Quercus lyrata, Quercus robur, Quercus petraea, Quercus rubra, Quercus velutina, Quercus laurifolia, Quercus falcata, Quercus nigra, Quercus phellos, Quercus texana); Obeche; Okoume; Oregon Myrtle; California Bay Laurel; Pear; Poplar (e.g. P. balsamifera, P. nigra, Populus balsamifera, P. fremontii and P. nigra Hybrid Poplar (Populus×canadensi)); Ramin; Red cedar; Rosewood; Sal; Sandalwood; Sassafras; Satinwood; Silky Oak; Silver Wattle; Snakewood; Sourwood; Spanish cedar; American sycamore; Teak; Walnut (e.g. Juglans nigra, Juglans regia); Willow (e.g. Salix nigra, Salix alba); Yellow poplar (Liriodendron tulipifera); Bamboo; Palmwood; and combinations/hybrids thereof.

“Lignin” is a polyphenolic material comprised of methoxylated phenylpropane units linked by ether and carbon-carbon bonds. Lignins can be highly branched and can also be crosslinked. Lignins can have significant structural variation that depends, at least in part, on the plant source involved. Lignin fills spaces in the cell wall and between cellulose, hemicellulose, and, if present, pectin components.

As used herein, the term “native lignin” refers to lignin in its natural state, in plant material.

Native lignin is a naturally occurring amorphous complex cross-linked organic macromolecule that comprises an integral component of all plant biomass. The chemical structure of lignin is irregular in the sense that different structural units (e.g., phenylpropane units) are not linked to each other in any systematic order. Extracting native lignin from lignocellulosic biomass during pulping generally results in lignin fragmentation into numerous mixtures of irregular components. Furthermore, the lignin fragments may react with any chemicals employed in the pulping process. Consequently, the generated lignin fractions can be referred to as lignin derivatives and/or technical lignins. As it is difficult to elucidate and characterize such complex mixtures of molecules, lignin derivatives are usually described in terms of the lignocellulosic plant material used, and the methods by which they are generated and recovered from lignocellulosic plant material, i.e., hardwood lignins, softwood lignins, and annual fiber lignins.

Native lignins are partially depolymerized during the pulping processes into lignin fragments which dissolve in the pulping liquors and are subsequently separated from the cellulosic pulps. Post-pulping liquors containing lignin and polysaccharide fragments, and other extractives, are commonly referred to as “black liquors” or “spent liquors,” depending on the pulping process. Such liquors are generally considered a by-product, and it is common practice to combust them to recover some energy value in addition to recovering the cooking chemicals. However, it is also possible to precipitate and/or recover lignin derivatives from these liquors. Each type of pulping process used to separate cellulosic pulps from other lignocellulosic components produces lignin derivatives that are very different in their physico-chemical, biochemical, and structural properties.

As used herein, the terms “lignin derivatives” and “derivatives of native lignin” refer to lignin material extracted from lignocellulosic biomass. Usually, such material will be a mixture of chemical compounds that are generated during the extraction process. A lignin derivative may include a lignin mimic.

A “lignin mimic” can refer to a compound, either chemically synthesized or in its natural form, that is capable of mimicking the conformation and desirable features of a natural lignin.

The term “hemicellulose” can refer to polysaccharides comprising mainly sugars or combinations of sugars (e.g., xylose). Hemicellulose can be highly branched. Hemicellulose can be chemically bonded to lignin and can further be randomly acetylated, which can reduce enzymatic hydrolysis of the glycosidic bonds in hemicellulose. See Samuel, R. et al., Structural changes in switchgrass lignin and hemicelluloses during pretreatments by NMR analysis, Polym. Degrad. Stabil., 96(11):2002-2009 (2011), which is hereby incorporated by reference in its entirety. Examples of a hemicellulose include but are not limited to xyloglucan, xylan, mannan, galactomannan, arabinoglucuronoxylan, glucuronoxylan, glucomannan and galactoglucomannan. In one embodiment, the hemicellulose is at least one selected from the group consisting of xylan, arabinoglucuronoxylan, glucuronoxylan, glucomannan, galactomannan and galactoglucomannan.

“Hemicellulose derivative” refers to a structural component of plant cell walls other than cellulose and lignin, or a derivative thereof. Hemicelluloses are heterogeneous and vary depending on the origin of the plant material, but the most commonly found components include xylans, glucomannans, galactans, glucans, and xyloglucans. Thus, upon hydrolysis, hemicellulose may yield glucose, galactose, mannose, xylose, arabinose and/or derivatives thereof.

“Saccharide” refers to monomeric, dimeric, oligomeric, or polymeric aldose and ketose carbohydrates. Monosaccharides are simple sugars with multiple hydroxyl groups and exist preferably as cyclic hemiacetals and hemiketals but may also exist in acyclic forms. Stereoisomers of cyclic monosaccharides can exist in α- or β-forms and in D- or L-forms. Disaccharides are two monosaccharides that are covalently linked by a glycosidic bond. Saccharides are also found in modified form, either as natural products or as a result of chemical modification during hydrolysis or industrial processing. Saccharide derivatives include those modified by deoxygenation or addition of moieties such as acetyl, amino, or methyl groups. In oligosaccharides and polysaccharides, saccharide monomers are connected by characteristic glycosidic linkages, e.g., β1-4, α1-6, α1-2, α1-3, or β1-2. In some polymers, such as cellulose, the linkages are uniform throughout the polymer, while in others, primarily hemicellulosic materials, the linkages may be mixed. Short (typically 1-3 saccharides) branched side chains may also be present in polysaccharides, typically from hemicellulose.

The term “polysaccharide” is used herein to denote a polymeric carbohydrate structure formed of monosaccharides joined together by glycosidic bonds. A “heteropolysaccharide” is a polysaccharide with two or more different monosaccharide units. A “homopolysaccharide” is a polysaccharide with one type of monosaccharide unit. “Hemicellulose” is a cell wall polysaccharide of land plants with an amorphous structure. “Wood hemicellulose” is a polysaccharide found in softwoods (conifers) and hardwoods (eudicotyledons).

“Arabinose” refers to the monosaccharide arabinopentose and its derivatives, occurring primarily as L-arabinofuranose in xylans and xyloglucans.

“Galactose” refers to the monosaccharide galacto-hexose and its derivatives, occurring primarily as D-galactopyranose in xylans and glucomannans.

“Glucose” refers to the monosaccharide gluco-hexose and its derivatives, occurring primarily as D-glucopyranose in cellulose, glucomannans, and xyloglucans.

“Hexose” refers to C6 sugars and their derivatives, which may occur in pyranose or furanose form. The hexoses most commonly found in plant material are glucose, galactose, and mannose.

“Mannose” refers to manno-hexose and its derivatives, occurring primarily as D-mannopyranose in glucomannans.

“Pentose” refers to C5 sugars and their derivatives, which may occur in pyranose or furanose form. The pentoses most commonly found in plant material are arabinose and xylose.

“C6 and/or C5 sugar” refers to monosaccharides including, for example, hexose (“C6”) sugars (e.g., aldohexoses such as glucose, mannose, galactose, gulose, idose, talose, aldohexose, allose, altrose; and ketohexoses such as psicose, fructose, sorbose, tagatose; or others, singly or in any combinations thereof), and/or pentose (“C5”) sugars (e.g., aldopentoses such as xylose, arabinose, ribose, lyxose; ketopentoses such as ribulose, xylulose; and others, singly or in any combinations thereof). Hexose is a monosaccharide with six carbon atoms, having the chemical formula C₆H₁₂O₆. Hexoses can be classified, for example, by a functional group, with aldohexoses having an aldehyde functional group at position 1, and ketohexoses having a ketone functional group at position 2. As known, 6-carbon aldose sugars can form cyclic hemiacetals, which can include a pyranose structure. In solution, open-chain forms and cyclic forms of 6-carbon aldose sugars can exist in equilibrium, or be present in other relative fractions to each other. Pentose is a monosaccharide with five carbon atoms, having the chemical formula C₅H₁₀O₅. Pentoses can be classified, for example, into two groups, with aldopentoses having an aldehyde functional group at position 1, and ketopentoses having a ketone functional group at position 2. As known, 5-carbon aldose sugars also can have cyclic hemiacetal forms, which can include a furanose structure or a pyranose structure. The hemiacetal cyclic forms of 5-carbon aldose sugars may spontaneously open and close, wherein mutarotation may occur.

“Xylose” refers to xylo-pentose and its derivatives, occurring primarily as D-xylopyranose in xylans and xyloglucans.

The terms “glycosidic bond” and “glycosidic linkage” refer to a linkage between the hemiacetal group of one saccharide unit and the hydroxyl group of another saccharide unit.

Saccharification is the process of hydrolyzing polymers of the source material, such as cellulose and hemicellulose, or starch, into fermentable mono- and di-saccharides such as cellobiose, glucose, xylose, arabinose, mannose, and galactose. For cellulosic polysaccharides, methods for saccharification include autohydrolysis, acid hydrolysis, and enzymatic hydrolysis. Saccharification and variations thereof refer to the process of converting polysaccharides (e.g., hemicellulose) to fermentable sugars, e.g., through the hydrolysis of glycosidic bonds. Saccharification can be effected with enzymes or chemicals. Enzymes, such as hemicellulases, can be added to biomass directly (e.g., as a solid or liquid enzyme additive) or can be produced in situ by microbes (e.g., yeasts, fungi, bacteria, etc.). Saccharification products include, for example, fermentable sugars, such as glucose and other small (low molecular weight) saccharides such as monosaccharides, disaccharides, and trisaccharides.

“Suitable conditions” for saccharification refer to various conditions known to one of skill in the art including pH, temperature, biomass composition, and enzyme composition.

“Fermentation” refers to the biological conversion of a carbon source into a bioproduct by a microorganism. Fermentation may be aerobic or anaerobic. Anaerobic fermentation takes place in a medium or atmosphere substantially free of molecular oxygen.

The term “enzyme” refers to a protein that catalyzes a chemical reaction. In particular, enzymes may include those polypeptides that can specifically cleave or break bonds between saccharides or sugars and lignins at non-glycosidic positions. More particularly, enzymes may include the polypeptides that cleave a non-glycosidic ether bond between a lignin or a derivative thereof and a saccharide. The cleavage of the non-glycosidic ether bond can be between an aromatic or non-aromatic carbon of the lignin or the derivative thereof and the saccharide.

The term “catalytic domain” means the region of an enzyme containing the catalytic machinery of the enzyme. In one embodiment, the catalytic domain comprises amino acids 509-702 of SEQ ID NO:2 or the amino acids of SEQ ID NO:4 having hemicellulose:lignin etherase (HLE) activity.

The term “subsequence” means a polynucleotide having one or more (e.g., several) nucleotides absent from the 5′ and/or 3′ end of a mature polypeptide coding sequence; wherein the subsequence encodes a catalytic fragment having hemicellulose:lignin etherase (HLE) activity. In one aspect, a subsequence contains at least 585 nucleotides (e.g., nucleotides 1525-2109 of SEQ ID NO:1 and 1-585 of SEQ ID NO:3).
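The relationship between the amino acid range of the catalytic domain (509-702 of SEQ ID NO:2) and the nucleotide range of the example subsequence (1525-2109 of SEQ ID NO:1) can be checked with simple codon arithmetic. The following is a minimal sketch only. It assumes that the coding sequence begins at nucleotide 1 of SEQ ID NO:1 and that the final three nucleotides of the 585-nucleotide span are a stop codon; neither assumption is stated in the specification, and the function name is hypothetical.

```python
def residue_range_to_nucleotides(first_residue, last_residue, include_stop=False):
    """Map a 1-based amino acid range to 1-based nucleotide coordinates.

    Assumes (hypothetically) that codon 1 starts at nucleotide 1.
    """
    start = 3 * (first_residue - 1) + 1          # first base of the first codon
    end = 3 * last_residue                       # last base of the last codon
    if include_stop:
        end += 3                                 # ASSUMPTION: trailing stop codon
    return start, end


start, end = residue_range_to_nucleotides(509, 702, include_stop=True)
length = end - start + 1                         # number of nucleotides spanned
```

Under these assumptions, 194 residues account for 582 nucleotides, and the assumed stop codon supplies the remaining three of the 585 nucleotides recited above.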

The term “variant” means a polypeptide having hemicellulose:lignin etherase (HLE) activity comprising an alteration, i.e., a substitution, insertion, and/or deletion, at one or more (e.g., several) positions. A substitution means replacement of the amino acid occupying a position with a different amino acid; a deletion means removal of the amino acid occupying a position; and an insertion means adding an amino acid adjacent to and immediately following the amino acid occupying a position. An example of a variant includes the amino acids of SEQ ID NO:4 having hemicellulose:lignin etherase (HLE) activity.

The term “cDNA” means a DNA molecule that can be prepared by reverse transcription from a mature, spliced, mRNA molecule obtained from a eukaryotic or prokaryotic cell. A cDNA lacks intron sequences that may be present in the corresponding genomic DNA. The initial, primary RNA transcript is a precursor to mRNA that is processed through a series of steps, including splicing, before appearing as mature spliced mRNA. A cDNA, according to the embodiment of the invention, encodes a polypeptide that cleaves a non-glycosidic ether bond between a lignin or a derivative thereof and a saccharide. The cleavage of the non-glycosidic ether bond can be between an aromatic or non-aromatic carbon of the lignin or the derivative thereof and the saccharide. In one embodiment, a cDNA encompasses a nucleotide sequence of SEQ ID NO:1 or SEQ ID NO:3.

In one embodiment, a cDNA encompasses a nucleotide sequence of SEQ ID NO:49.

The term “coding sequence” means a polynucleotide which directly specifies the amino acid sequence of a polypeptide. The boundaries of the coding sequence are generally determined by an open reading frame, which begins with a start codon such as ATG, GTG, or TTG and ends with a stop codon such as TAA, TAG, or TGA. The coding sequence may be a genomic DNA, cDNA, synthetic DNA, or a combination thereof.
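By way of illustration only, the open reading frame boundaries described above can be located computationally. The following minimal sketch scans the forward strand of a DNA string for the first start codon (ATG, GTG, or TTG) that is followed in frame by a stop codon (TAA, TAG, or TGA). The function name and the example sequence are hypothetical and are not taken from the Sequence Listing; a practical tool would also examine the reverse strand and apply a minimum length.

```python
START_CODONS = {"ATG", "GTG", "TTG"}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_first_orf(dna):
    """Return (start, end) of the first forward-strand ORF, or None.

    Coordinates are 0-based and half-open, so dna[start:end] runs from the
    start codon through the stop codon inclusive.
    """
    seq = dna.upper()
    for start in range(len(seq) - 2):
        if seq[start:start + 3] not in START_CODONS:
            continue
        # Walk codon by codon, in frame with the start codon.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                return start, pos + 3
    return None


example = "CCATGAAATAGGG"            # hypothetical sequence for illustration
orf = find_first_orf(example)
```

In the example, the out-of-frame TGA that overlaps the ATG is ignored because only codons in frame with the start codon are tested.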

The term “control sequences” means nucleic acid sequences necessary for expression of a polynucleotide encoding a mature polypeptide of the present invention. Each control sequence may be native (i.e., from the same gene) or foreign (i.e., from a different gene) to the polynucleotide encoding the polypeptide or native or foreign to each other. Such control sequences include, but are not limited to, a leader, polyadenylation sequence, propeptide sequence, promoter, signal peptide sequence, and transcription terminator. At a minimum, the control sequences include a promoter, and transcriptional and translational stop signals. The control sequences may be provided with linkers for the purpose of introducing specific restriction sites facilitating ligation of the control sequences with the coding region of the polynucleotide encoding a polypeptide.

The control sequence may also be an appropriate promoter sequence which is recognized by a host cell for expression of the isolated polynucleotide sequence of the present invention. The promoter sequence contains transcriptional control sequences which mediate the expression of the HLEs. The promoter may be any nucleic acid sequence which shows transcriptional activity in the host cell of choice including mutant, truncated, and hybrid promoters, and may be obtained from genes encoding extracellular or intracellular polypeptides either homologous or heterologous to the host cell.

The term “expression” includes any step involved in the production of a polypeptide including, but not limited to, transcription, post-transcriptional modification, translation, post-translational modification, and secretion.

The term “expression vector” means a linear or circular DNA molecule that comprises a polynucleotide encoding a polypeptide and is operably linked to control sequences that provide for its expression. Examples of expression vectors include but are not limited to pHIS525-cMLE, pHIS525-cfMLE, pAES40-cMLE, pAES40-cfMLE, pHT43-cMLE, pHT43-cfMLE, pBluescript SK⁻-cMLE, pBluescript SK⁻-cfMLE, pFN6A-cMLE and pFN6A-cfMLE.

The term “host cell” means any cell type that is susceptible to transformation, transfection, transduction, or the like with a nucleic acid construct or expression vector comprising the isolated polynucleotide of the present invention. The term “host cell” encompasses any progeny of a parent cell that is not identical to the parent cell due to mutations that occur during replication. Examples of host cells include but are not limited to Escherichia coli, Bacillus megaterium, and Bacillus subtilis.

Recombinant host cells, according to the embodiment of the invention, comprise a complementary DNA (cDNA) sequence encoding a polypeptide that cleaves a non-glycosidic ether bond between a lignin or a derivative thereof and a saccharide.

The term “purified” or “isolated,” in relation to an enzyme or nucleic acid, indicates the enzyme or nucleic acid is not in its natural medium or form. The term “isolated” thus includes an enzyme or nucleic acid removed from its original environment, e.g., the natural environment if it is naturally occurring. For instance, an isolated enzyme is typically devoid of at least some proteins or other constituents of the cells with which it is normally associated or with which it is normally admixed or in solution. An isolated enzyme includes said enzyme, naturally produced, contained in a cell lysate or secreted into a culture supernatant; the enzyme in a purified or partially purified form; the recombinant enzyme; the enzyme which is expressed or secreted by a bacterium; as well as the enzyme in a heterologous host cell or culture. In relation to a nucleic acid, the term isolated or purified indicates, e.g., that the nucleic acid is not in its natural genomic context (e.g., in a vector, as an expression cassette, linked to a promoter, or artificially introduced in a heterologous host cell).

As used herein, “heterologous” in reference to a nucleic acid (cDNA or polynucleotide) or protein (polypeptide) includes a molecule that has been manipulated by human intervention so that it is located in a place other than the place in which it is naturally found. For example, a nucleic acid sequence from one organism (e.g. from one strain or species) may be introduced into the genome of another organism (e.g. of another strain or species), or a nucleic acid sequence from one genomic locus may be moved to another genomic or extrachromosomal locus in the same organism. A heterologous protein includes, for example, a protein expressed from a heterologous coding sequence or a protein expressed from a recombinant gene in a cell that would not naturally express the protein.

The term “nucleic acid construct” means a nucleic acid molecule, either single- or double-stranded, which is isolated from a naturally occurring gene, or is modified to contain segments of nucleic acids in a manner that would not otherwise exist in nature, or is synthetic, and which comprises one or more control sequences.

The term “operably linked” means a configuration in which a control sequence is placed at an appropriate position relative to the coding sequence of a polynucleotide such that the control sequence directs expression of the coding sequence.

As used herein, “identity” and “percent identity,” in the context of two or more polypeptide sequences, refer to two or more sequences or subsequences that are the same or have a specified percentage of amino acid residues that are the same (e.g., share at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 88%, at least about 89%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity) over a specified region to a reference sequence, when compared and aligned for maximum correspondence over a comparison window or designated region, as measured using a sequence comparison algorithm or by manual alignment and visual inspection.

In some embodiments, the terms “percent identity,” “% identity,” “percent identical,” and “% identical” are used interchangeably herein to refer to the percent amino acid or polynucleotide sequence identity that is obtained by ClustalW analysis (version W 1.8 available from European Bioinformatics Institute, Cambridge, UK) or by Clustal Omega analysis (see Sievers F., et al., Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega, Mol. Syst. Biol., 7(539):1-6 (2011), which is incorporated herein by reference in its entirety) that is available from University College Dublin (Dublin, Ireland), counting the number of identical matches in the alignment and dividing such number of identical matches by the length of the reference sequence, and using the following ClustalW parameters to achieve slow/more accurate pairwise optimal alignments: DNA/Protein Gap Open Penalty: 15/10; DNA/Protein Gap Extension Penalty: 6.66/0.1; Protein weight matrix: Gonnet series; DNA weight matrix: Identity.
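The counting step described above (identical matches divided by the length of the reference sequence) can be illustrated as follows. This is a minimal sketch that assumes the two sequences have already been aligned, e.g., by ClustalW or Clustal Omega, and are supplied as equal-length strings with “-” marking gaps. The function name and the example sequences are hypothetical; the sketch does not perform the alignment itself.

```python
def percent_identity(aligned_query, aligned_reference, gap="-"):
    """Percent identity of a pre-aligned pair, relative to the reference length.

    Identical matches are counted column by column; the denominator is the
    number of residues in the reference sequence (gap characters excluded).
    """
    if len(aligned_query) != len(aligned_reference):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(
        1
        for q, r in zip(aligned_query, aligned_reference)
        if q == r and q != gap
    )
    reference_length = sum(1 for r in aligned_reference if r != gap)
    if reference_length == 0:
        raise ValueError("reference sequence is empty")
    return 100.0 * matches / reference_length


# Hypothetical aligned pair: five identical columns over a six-residue reference.
identity = percent_identity("MKT-LV", "MKTALV")
```

Because the denominator is the reference length rather than the alignment length, the value depends on which sequence is designated the reference.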

Two sequences are “aligned” when they are aligned for similarity scoring using a defined amino acid substitution matrix (e.g., BLOSUM62), gap existence penalty and gap extension penalty so as to arrive at the highest score possible for that pair of sequences. Amino acid substitution matrices and their use in quantifying the similarity between two sequences are well known in the art. See, e.g., Dayhoff et al., in Dayhoff (ed.), “Atlas of Protein Sequence and Structure,” Vol. 5, Suppl. 3, Natl. Biomed. Res. Found., Washington, D.C. (1978), pp. 345-352; and Henikoff, S. and Henikoff, J. G., Proc. Natl. Acad. Sci. USA, 89:10915-10919 (1992), both of which are incorporated herein by reference in their entirety. The BLOSUM62 matrix is often used as a default scoring substitution matrix in sequence alignment protocols such as Gapped BLAST 2.0. The gap existence penalty is imposed for the introduction of a single amino acid gap in one of the aligned sequences, and the gap extension penalty is imposed for each additional empty amino acid position inserted into an already opened gap. The alignment is defined by the amino acid position of each sequence at which the alignment begins and ends, and optionally by the insertion of a gap or multiple gaps in one or both sequences so as to arrive at the highest possible score. While optimal alignment and scoring can be accomplished manually, the process is facilitated by the use of a computer-implemented alignment algorithm (e.g., gapped BLAST 2.0; see Altschul et al., Nucleic Acids Res., 25:3389-3402 (1997), which is incorporated herein by reference in its entirety, and made available to the public at the National Center for Biotechnology Information website). Optimal alignments, including multiple alignments, can be prepared using readily available programs such as PSI-BLAST (see, e.g., Altschul et al., 1997, supra).
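The gap existence and gap extension penalties described above can be illustrated by scoring a given alignment. The following minimal sketch charges the existence penalty for the first position of each gap and the extension penalty for each additional position, as stated in the preceding paragraph. For brevity it substitutes a simple match/mismatch score for a full substitution matrix such as BLOSUM62; that simplification, the default penalty values, and the function name are assumptions made for illustration and are not taken from any particular alignment program.

```python
def score_alignment(a, b, match=1, mismatch=-1, gap_open=11, gap_extend=1, gap="-"):
    """Score a pre-aligned pair of equal-length strings with affine gap costs.

    A gap of n positions costs gap_open + (n - 1) * gap_extend.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    score = 0
    gap_in_a = gap_in_b = False          # whether a gap is currently open
    for x, y in zip(a, b):
        if x == gap and y == gap:
            raise ValueError("column with gaps in both sequences")
        if x == gap:
            score -= gap_extend if gap_in_a else gap_open
            gap_in_a, gap_in_b = True, False
        elif y == gap:
            score -= gap_extend if gap_in_b else gap_open
            gap_in_a, gap_in_b = False, True
        else:
            score += match if x == y else mismatch
            gap_in_a = gap_in_b = False
    return score


# Hypothetical example: four matches and one two-position gap.
example_score = score_alignment("AC--GT", "ACTTGT", gap_open=5, gap_extend=1)
```

An alignment program searches over candidate alignments for the one that maximizes such a score; the sketch only evaluates a single candidate.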

The present invention also provides a recombinant nucleic acid construct comprising a polynucleotide sequence that hybridizes under stringent hybridization conditions to the complement of a polynucleotide which encodes a polypeptide having the amino acid sequence of SEQ ID NO:2 and/or 4.

The present invention also provides a recombinant nucleic acid construct comprising a polynucleotide sequence that hybridizes under stringent hybridization conditions to the complement of a polynucleotide which encodes a polypeptide having the amino acid sequence of SEQ ID NO:50.

Two nucleic acid or polypeptide sequences that have 100% sequence identity are said to be “identical.” A nucleic acid or polypeptide sequence is said to have “substantial sequence identity” to a reference sequence when the sequences have at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99%, or greater sequence identity as determined using the methods described herein, such as BLAST using standard parameters.

As used herein, a “secretion signal peptide” can be a propeptide, a prepeptide or both. For example, the term “propeptide” refers to a protein precursor that is cleaved to yield a “mature protein.” The signal peptide is cleaved from the pre-protein by a signal peptidase prior to secretion to result in the “mature” or “secreted” protein. The terms “prepeptide” and “pre-protein” refer to a polypeptide synthesized with an N-terminal signal peptide that targets it for secretion. Accordingly, a “pre-pro-peptide” is a polypeptide that contains a signal peptide that targets the polypeptide for secretion and which is cleaved off to yield a mature polypeptide. Signal peptides can be found at the N-terminus of the protein and are typically composed of between 6 and 136 basic and hydrophobic amino acids.

The term “mature polypeptide” means a polypeptide having HLE activity or capable of specifically cleaving a non-glycosidic ether bond between a lignin or a derivative thereof and a polysaccharide in its final form following translation and any post-translational modifications. It is known in the art that a host cell may produce a mixture of two or more different mature polypeptides (i.e., with a different C-terminal and/or N-terminal amino acid) expressed by the same polynucleotide. The mature polypeptide can be predicted using the SignalP program. See Nielsen et al., Protein Engineering 10:1-6 (1997), which is hereby incorporated by reference in its entirety.

The term “mature polypeptide coding sequence” is defined herein as a nucleotide sequence that encodes a mature polypeptide having HLE activity or capable of specifically cleaving a non-glycosidic ether bond between a lignin or a derivative thereof and a saccharide. The mature polypeptide coding sequence can be predicted using the SignalP program. See Nielsen et al., 1997, supra.

The term “very high stringency conditions” means, for probes of at least 100 nucleotides in length, prehybridization and hybridization at 42° C. in 5×SSPE, 0.3% SDS, 200 micrograms/ml sheared and denatured salmon sperm DNA, and 50% formamide, following standard Southern blotting procedures for 12 to 24 hours. The carrier material is finally washed three times each for 15 minutes using 2×SSC, 0.2% SDS at 70° C.

The term “high stringency conditions” means, for probes of at least 100 nucleotides in length, prehybridization and hybridization at 42° C. in 5×SSPE, 0.3% SDS, 200 micrograms/ml sheared and denatured salmon sperm DNA, and 50% formamide, following standard Southern blotting procedures for 12 to 24 hours. The carrier material is finally washed three times each for 15 minutes using 2×SSC, 0.2% SDS at 65° C.

The term “medium-high stringency conditions” means, for probes of at least 100 nucleotides in length, prehybridization and hybridization at 42° C. in 5×SSPE, 0.3% SDS, 200 micrograms/ml sheared and denatured salmon sperm DNA, and 35% formamide, following standard Southern blotting procedures for 12 to 24 hours. The carrier material is finally washed three times each for 15 minutes using 2×SSC, 0.2% SDS at 60° C.

The term “medium stringency conditions” means, for probes of at least 100 nucleotides in length, prehybridization and hybridization at 42° C. in 5×SSPE, 0.3% SDS, 200 micrograms/ml sheared and denatured salmon sperm DNA, and 35% formamide, following standard Southern blotting procedures for 12 to 24 hours. The carrier material is finally washed three times each for 15 minutes using 2×SSC, 0.2% SDS at 55° C.

The term “low stringency conditions” means, for probes of at least 100 nucleotides in length, prehybridization and hybridization at 42° C. in 5×SSPE, 0.3% SDS, 200 micrograms/ml sheared and denatured salmon sperm DNA, and 25% formamide, following standard Southern blotting procedures for 12 to 24 hours. The carrier material is finally washed three times each for 15 minutes using 2×SSC, 0.2% SDS at 50° C.

The term “very low stringency conditions” means, for probes of at least 100 nucleotides in length, prehybridization and hybridization at 42° C. in 5×SSPE, 0.3% SDS, 200 micrograms/ml sheared and denatured salmon sperm DNA, and 25% formamide, following standard Southern blotting procedures for 12 to 24 hours. The carrier material is finally washed three times each for 15 minutes using 2×SSC, 0.2% SDS at 45° C.
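The six stringency definitions above share the same hybridization buffer, temperature, and duration and differ only in formamide concentration and final wash temperature. As an illustrative summary, they can be collected into a single lookup table. The following sketch uses values taken from the definitions above; the class, dictionary, and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stringency:
    formamide_percent: int   # formamide in the hybridization buffer (%)
    wash_temp_c: int         # temperature of the final 2xSSC, 0.2% SDS washes


# Ordered from least to most stringent.
STRINGENCY = {
    "very low": Stringency(25, 45),
    "low": Stringency(25, 50),
    "medium": Stringency(35, 55),
    "medium-high": Stringency(35, 60),
    "high": Stringency(50, 65),
    "very high": Stringency(50, 70),
}


def hybridization_summary(level):
    """Return a one-line description of the conditions for a stringency level."""
    s = STRINGENCY[level]
    return (
        f"{level}: hybridize at 42 C in 5xSSPE, 0.3% SDS, 200 ug/ml salmon "
        f"sperm DNA, {s.formamide_percent}% formamide for 12-24 h; wash "
        f"3 x 15 min in 2xSSC, 0.2% SDS at {s.wash_temp_c} C"
    )


summary = hybridization_summary("high")
```

Arranged this way, each step up in stringency raises the wash temperature by 5° C., and the formamide concentration rises with every second step.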

In an embodiment, the present invention relates to isolated polypeptides having a sequence identity to the polypeptide of SEQ ID NO:2 or SEQ ID NO:4 or an allelic variant thereof or a fragment thereof, of at least about 70%, at least about 75%, at least about 80%, at least about 81%, at least about 82%, at least about 83%, at least about 84%, at least about 85%, at least about 86%, at least about 87%, at least about 88%, at least about 89%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or 100%; which have HLE activity or are capable of specifically cleaving a non-glycosidic ether bond between a lignin or a derivative thereof and a saccharide or a derivative thereof (such as an acetylated saccharide). In one aspect, the polypeptides differ by up to 10 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 from the polypeptide of SEQ ID NO:2 or SEQ ID NO:4.

In an embodiment, the present invention relates to isolated polypeptides having a sequence identity to the polypeptide of SEQ ID NO:50 or an allelic variant thereof or a fragment thereof, of at least about 70%, at least about 75%, at least about 80%, at least about 81%, at least about 82%, at least about 83%, at least about 84%, at least about 85%, at least about 86%, at least about 87%, at least about 88%, at least about 89%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or 100%; which have HLE activity or are capable of specifically cleaving a non-glycosidic ether bond between a lignin or a derivative thereof and a saccharide or a derivative thereof (such as an acetylated saccharide). In one embodiment, the polypeptides differ by up to 10 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 from the polypeptide of SEQ ID NO:50.

An isolated polypeptide of the present invention preferably comprises or consists of the amino acid sequence of SEQ ID NO:2, SEQ ID NO:4 or an allelic variant thereof; or a fragment thereof having HLE activity or capable of specifically cleaving a non-glycosidic ether bond between a lignin or a derivative thereof and a saccharide. In another aspect, the polypeptide comprises or consists of the mature polypeptide of SEQ ID NO:2 or SEQ ID NO:4. In another aspect, the isolated polypeptide comprises or consists of amino acids 509 to 702 of SEQ ID NO:2. In another aspect, the polypeptide comprises or consists of the amino acids of SEQ ID NO:4. In another embodiment, the present invention relates to isolated polypeptides having HLE activity that are encoded by polynucleotides that hybridize under very low stringency conditions, low stringency conditions, medium stringency conditions, medium-high stringency conditions, high stringency conditions, or very high stringency conditions with (i) the polypeptide coding sequence of SEQ ID NO:1 or the cDNA sequence thereof, the mature polypeptide coding sequence of SEQ ID NO:3 or the cDNA sequence thereof, or (ii) the full-length complement of (i). See Sambrook et al., 1989, Molecular Cloning, A Laboratory Manual, 2d edition, Cold Spring Harbor, N.Y., which is hereby incorporated by reference in its entirety.

An isolated polypeptide of the present invention preferably comprises or consists of the amino acid sequence of SEQ ID NO:50 or an allelic variant thereof; or a fragment thereof having HLE activity or capable of specifically cleaving a non-glycosidic ether bond between a lignin or a derivative thereof and a saccharide. Alternatively, the polypeptide comprises or consists of the mature polypeptide of SEQ ID NO:50. In another embodiment, the present invention relates to isolated polypeptides having HLE activity that are encoded by polynucleotides that hybridize under very low stringency conditions, low stringency conditions, medium stringency conditions, medium-high stringency conditions, high stringency conditions, or very high stringency conditions with (i) the polypeptide coding sequence of SEQ ID NO:49 or the cDNA sequence thereof, or (ii) the full-length complement of (i). See Sambrook et al., 1989, Molecular Cloning, A Laboratory Manual, 2d edition, Cold Spring Harbor, N.Y., which is hereby incorporated by reference in its entirety.

A genomic DNA or cDNA library prepared from such other host strains maybe screened for DNA that hybridizes with the probes described herein andencodes a polypeptide having HLE activity or capable of specificallycleaving a non-glycosidic ether bond between a lignin or a derivativethereof and a saccharide. Genomic or other DNA from such other strainsmay be separated by agarose or polyacrylamide gel electrophoresis, orother separation techniques. DNA from the libraries or the separated DNAmay be transferred to and immobilized on nitrocellulose or othersuitable carrier material. In order to identify a clone or DNA that ishomologous with SEQ ID NO:1 or SEQ ID NO:3, or a subsequence thereof,the carrier material is used in a Southern blot.

Similarly, to identify a clone or DNA that is homologous with SEQ ID NO:49, or a subsequence thereof, the carrier material is used in a Southern blot.

For purposes of the present invention, hybridization indicates that the polynucleotide hybridizes to a labeled nucleic acid probe corresponding to SEQ ID NO:1 or the cDNA sequence thereof, or SEQ ID NO:3 or the cDNA sequence thereof; the mature polypeptide coding sequence of SEQ ID NO:1, or the mature polypeptide coding sequence of SEQ ID NO:3; the full-length complement thereof; or a subsequence thereof; under very low to very high stringency conditions. Molecules to which the nucleic acid probe hybridizes under these conditions can be detected using, for example, X-ray film or any other detection means known in the art.

Hybridization also includes that the polynucleotide hybridizes to a labeled nucleic acid probe corresponding to SEQ ID NO:49 or the cDNA sequence thereof; the mature polypeptide coding sequence of SEQ ID NO:49; the full-length complement thereof; or a subsequence thereof; under very low to very high stringency conditions. Molecules to which the nucleic acid probe hybridizes under these conditions can be detected using, for example, X-ray film or any other detection means known in the art.

In another embodiment, the present invention relates to isolated polypeptides having HLE activity encoded by polynucleotides having a sequence identity to the polypeptide coding sequence of SEQ ID NO: 1 or the cDNA sequence thereof, or the mature polypeptide coding sequence of SEQ ID NO: 3 or the cDNA sequence thereof, of at least about 70%, at least about 75%, at least about 80%, at least about 81%, at least about 82%, at least about 83%, at least about 84%, at least about 85%, at least about 86%, at least about 87%, at least about 88%, at least about 89%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or 100%.

In another embodiment, the present invention relates to isolated polypeptides having HLE activity encoded by polynucleotides having a sequence identity to the polypeptide coding sequence of SEQ ID NO: 49 or the cDNA sequence thereof, of at least about 70%, at least about 75%, at least about 80%, at least about 81%, at least about 82%, at least about 83%, at least about 84%, at least about 85%, at least about 86%, at least about 87%, at least about 88%, at least about 89%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or 100%.
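The sequence identity thresholds recited above can be illustrated with a minimal sketch. It assumes the two polynucleotide sequences have already been aligned (gaps written as "-") and defines identity as matching columns divided by compared columns; this passage does not specify the alignment program or the exact identity formula, so both are assumptions, and the function names and toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over two pre-aligned sequences of equal length.

    Assumption: identity = matching columns / compared columns, where
    columns that are gaps in both sequences are skipped and a gap
    opposite a residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # column carries no information
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold_percent: float = 70.0) -> bool:
    """True if the aligned pair reaches the given identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold_percent


# Toy sequences (hypothetical, not taken from the Sequence Listing)
print(percent_identity("ATGCATGCAT", "ATGCATGAAA"))
print(meets_identity_threshold("ATGCATGCAT", "ATGCATGAAA", 70.0))
```

In practice the alignment itself would come from an established tool; the sketch only shows how a threshold such as "at least about 70%" is applied once an alignment exists.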

In another embodiment, the present invention relates to variants of the mature polypeptide of SEQ ID NO: 2 or SEQ ID NO: 4, comprising a substitution, deletion, and/or insertion at one or more (e.g., several) positions. In an embodiment, the number of amino acid substitutions, deletions and/or insertions introduced into the mature polypeptide of SEQ ID NO: 2 or SEQ ID NO: 4 is not more than 10, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10. The amino acid changes may be of a minor nature, that is conservative amino acid substitutions or insertions that do not significantly affect the folding and/or activity of the protein; small deletions, typically of 1-30 amino acids; small amino- or carboxyl-terminal extensions, such as an amino-terminal methionine residue; a small linker peptide of up to 20-25 residues; or a small extension that facilitates purification by changing net charge or another function, such as a poly-histidine tract, an antigenic epitope or a binding domain.

In another embodiment, the present invention relates to variants of the mature polypeptide of SEQ ID NO: 50, comprising a substitution, deletion, and/or insertion at one or more (e.g., several) positions. In an embodiment, the number of amino acid substitutions, deletions and/or insertions introduced into the mature polypeptide of SEQ ID NO: 50 is not more than 10, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10. The amino acid changes may be of a minor nature, that is conservative amino acid substitutions or insertions that do not significantly affect the folding and/or activity of the protein; small deletions, typically of 1-30 amino acids; small amino- or carboxyl-terminal extensions, such as an amino-terminal methionine residue; a small linker peptide of up to 20-25 residues; or a small extension that facilitates purification by changing net charge or another function, such as a poly-histidine tract, an antigenic epitope or a binding domain.
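One way to check whether a candidate variant stays within the "not more than 10" changes described above is to count the minimum number of single-residue substitutions, insertions and deletions separating it from the reference sequence. The sketch below uses the Levenshtein edit distance for that count; treating the edit distance as the number of changes is an assumption, and the function names and toy peptides are hypothetical.

```python
def edit_distance(reference: str, variant: str) -> int:
    """Minimum number of single-residue substitutions, insertions and
    deletions needed to turn `reference` into `variant` (Levenshtein)."""
    previous = list(range(len(variant) + 1))
    for i, ref_res in enumerate(reference, start=1):
        current = [i]
        for j, var_res in enumerate(variant, start=1):
            cost = 0 if ref_res == var_res else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution/match
        previous = current
    return previous[-1]


def within_variant_limit(reference: str, variant: str,
                         max_changes: int = 10) -> bool:
    """True if the variant differs by at least 1 and at most `max_changes`."""
    return 1 <= edit_distance(reference, variant) <= max_changes


# Toy peptides (hypothetical, not taken from the Sequence Listing)
print(edit_distance("MKTAYIAK", "MKTGYIAK"))
print(within_variant_limit("MKTAYIAK", "MKTGYIAK"))
```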

The nucleic acid constructs may include an expression vector that includes a cDNA molecule encoding the polypeptide having HLE activity operably linked to one or more control sequences that direct the expression of the coding sequence in a suitable transformed host cell under conditions compatible with the control sequences.

Also included herein are methods of producing heterologous polypeptides of the present invention, comprising: (a) cultivating a recombinant or transformed host cell under conditions conducive for production of the polypeptide; and (b) recovering the heterologous polypeptide. The heterologous polypeptide may be defined herein as a polypeptide which is not native to the host cell, a native protein in which modifications have been made to alter the native sequence or a native protein whose expression is quantitatively altered as a result of a manipulation of the host cell by recombinant DNA techniques.

The recombinant or transformed host cells are cultivated in a nutrient medium suitable for production of the heterologous polypeptides using methods known in the art. For example, the cells may be cultivated by shake flask cultivation, or small-scale or large-scale fermentation (including continuous, batch, fed-batch, or solid state fermentations) in laboratory or industrial fermentors, in a suitable medium and under conditions allowing the polypeptide to be expressed and/or isolated. The cultivation takes place in a suitable nutrient medium comprising carbon and nitrogen sources and inorganic salts, using procedures known in the art. Suitable media are available from commercial suppliers or may be prepared according to published compositions (e.g., in catalogues of the American Type Culture Collection). If the heterologous polypeptide is secreted into the nutrient medium, the heterologous polypeptide can be recovered directly from the medium. If the heterologous polypeptide is not secreted, it may be recovered from cell lysates.

The heterologous polypeptide may be detected using methods known in the art that are specific for the heterologous polypeptides. These detection methods include, but are not limited to, use of specific antibodies, formation of an enzyme product, or disappearance of an enzyme substrate. For example, an enzyme assay may be used to determine the activity of the polypeptide. The HLEs can be monitored or measured by gel permeation high pressure liquid chromatography (HPLC) or SDS-polyacrylamide gel electrophoresis (SDS-PAGE).

The heterologous polypeptide may be recovered using methods known in the art. For example, the heterologous polypeptide may be recovered from the nutrient medium by conventional procedures including, but not limited to, collection, centrifugation, filtration, extraction, spray-drying, evaporation, or precipitation and/or a combination thereof. In one aspect, the whole fermentation broth is recovered.

The heterologous polypeptide may be purified by a variety of procedures known in the art including, but not limited to, chromatography (e.g., ion exchange, affinity, hydrophobic, chromatofocusing, and size exclusion), electrophoretic procedures (e.g., preparative isoelectric focusing), differential solubility (e.g., ammonium sulfate precipitation), SDS-PAGE, or extraction (see, e.g., Janson, J.-C. and Ryden, L. (eds), Protein Purification: Principles, High Resolution Methods and Applications, VCH Publishers, Inc., NY, 1989, which is hereby incorporated by reference in its entirety) to obtain substantially pure polypeptides.

According to the embodiments of the inventions, bioprospecting and developing polypeptides or enzymes that specifically cleave non-glycosidic ether bonds between a lignin or a derivative thereof and a polysaccharide (e.g., mannan or xylan) using a lignin fluorogenic analog (e.g., 4-methylumbelliferone derivative or 4-MU) based on hemicellulose are described in more detail in the sections that follow. Derivatization of a polysaccharide (e.g., galactomannan such as locust bean gum) can be carried out as follows: (a) solubilizing a polysaccharide in an aqueous solvent (e.g., water); (b) inducing or precipitating the solubilized polysaccharide with at least two volumes (50-100%) of dimethylformamide (DMF; an organic solvent) to form a gel; (c) optionally, washing the gel with additional dimethylformamide:water (2:1) mixture; (d) mixing the gel with a few drops of DMF, a 10-fold molar excess of a fluorescent phenylacetate derivative (wherein the reactant is a 4-methylumbelliferone derivative, e.g., 4-methylumbelliferyl acetate or a 4-methylumbelliferone esterified with a good leaving group) and a molar excess of a catalyst (e.g., N-bromosuccinimide or NBS); (e) incubating the gel mixture at a temperature ranging from at least about 25° C. to about 90° C. for about 1 hour to overnight (from about 60° C. to about 90° C.) to form a derivatized polysaccharide mixture; and (f) washing the derivatized polysaccharide mixture with a solvent (e.g., DMF and/or 95% ethanol; acetone) to remove any free phenylacetate derivative. Examples of saccharides that contain 6-carbon sugars in a pyranose configuration may include, but are not limited to, mannose, mannan, galactomannan (e.g., locust bean gum), cellulose, galactan or hemicellulose. An example of a polysaccharide containing a 5-carbon sugar in the pyranose configuration is xylan. The derivatized polysaccharide, 4-methylumbelliferyl (4-MU)-locust bean gum (4MU-LBG), can be measured or monitored by gel permeation HPLC. See Sun, X.-F. et al., Acetylation of sugarcane bagasse hemicelluloses under mild reaction conditions by using NBS as a catalyst, J. Appl. Polym. Sci., 95(1):53-61 (2004), which is hereby incorporated by reference in its entirety.

In another embodiment, an exemplary method for preparing a derivatized mixture of polysaccharide composed of 5-carbon sugars (e.g., 4-methylumbelliferyl (4-MU)-xylan) may encompass the following: (a) hydrolyzing a xylan polymer under controlled conditions to form a residue; (b) refluxing the xylan polymer residue in dry methanol to yield methylated free glycosidic hydroxyl groups; (c) benzylating non-glycosidic hydroxyl residues of the xylan polymer with benzyl bromide using DMSO as a solvent and crown ether (benzo-18-crown-6) as a catalyst; (d) displacing the benzyl groups with triflic (trifluoromethanesulfonic) anhydride to convert the hydroxyl groups of the xylan polymer to suitable leaving groups; (e) brominating with tetra-N-butylammonium bromide to displace the leaving groups with halide anions; and (f) reacting the brominated xylan polymer with a fluorescent phenylacetate derivative (4-MU) to form a derivatized xylan (4MU-xylan). The polysaccharide, as used herein, is composed primarily of 5-carbon sugars in a pyranose configuration (e.g., xylose or xylan) derivatized on C2 or C3. The derivatized polysaccharide, 4-methylumbelliferyl (4-MU)-xylan, can also be measured or monitored by gel permeation HPLC, infrared spectroscopy or nuclear magnetic resonance (NMR).

EXAMPLES

Derivatizing Non-Glycosidic Carbons of Saccharides with Phenyl Ether Derivatives

Most derivatives of saccharides (monosaccharides, disaccharides, oligosaccharides, galactomannan, cellulose, galactan or polysaccharides that incorporate 6-carbon sugars in the pyranose configuration or polysaccharides that incorporate sugars such as heptopyranoses or 5-carbon sugars in the pyranose configuration and that have primary hydroxyls) are prepared by completely solubilizing the polysaccharide in an aqueous or organic solvent. Polymeric materials, especially natural polysaccharides, are diverse in molecular weight, and their solubility varies with molecular weight. Here, a solid-phase method was developed based on an unusual modification of a method from a previous study that devised a catalytic acetylation of xylan (a pentopyranose) in a semi-dry gel phase. See Sun, X. F. et al., 2004, supra.

Briefly, the water-solubilized polysaccharide was precipitated with two volumes (50-100% ratio) of dimethylformamide (DMF) to induce gel formation. The excess DMF/water solution was removed by filtration or by simply removing the gel to a fresh tube with forceps or some other simple method known to one skilled in the art. The gel may or may not be washed with additional DMF or DMF:water (2:1) mixture, and it may or may not be treated to increase its surface area to volume ratio, for example by mincing it manually with a razor blade. The gel was combined with a reactant composed of a phenyl acetate derivative (e.g., 4-methylumbelliferyl (4-MU) acetate, 4-MU, 4-MU-phosphate or 4-methylumbelliferone esterified with another leaving group) in approximately 10-fold molar excess to the number of residues of hexopyranose in the polysaccharide. N-bromosuccinimide (NBS) as a catalyst was added in very large molar excess. The mixture was incubated at elevated temperature (from at least about 37° C. to about 80° C.) for anywhere from one hour to overnight. During the incubation, the reaction mixture turned yellow within 10 minutes, then orange, and finally brown. The time course of the color change was dependent on the concentration of NBS, the nature and concentration of the phenyl acetate derivative and the incubation temperature.
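The "approximately 10-fold molar excess to the number of residues of hexopyranose" can be turned into a reactant mass with simple arithmetic. The sketch below is illustrative only: it assumes every residue of the polysaccharide is an anhydrohexose unit (C6H10O5, about 162.14 g/mol) and that the reactant is 4-methylumbelliferyl acetate (C12H10O4, about 218.21 g/mol); these molar masses, the function name and the default arguments are assumptions introduced here and are not stated in the text.

```python
# Assumed molar masses (standard values, not given in the text)
ANHYDROHEXOSE_G_PER_MOL = 162.14   # one hexopyranose residue in a polymer, C6H10O5
MU_ACETATE_G_PER_MOL = 218.21      # 4-methylumbelliferyl acetate, C12H10O4


def reactant_mass_mg(polysaccharide_g: float,
                     fold_excess: float = 10.0,
                     residue_g_per_mol: float = ANHYDROHEXOSE_G_PER_MOL,
                     reactant_g_per_mol: float = MU_ACETATE_G_PER_MOL) -> float:
    """Mass of reactant (mg) giving `fold_excess` moles per mole of residue.

    Assumption: the dry polysaccharide mass consists entirely of
    hexopyranose residues of mass `residue_g_per_mol`.
    """
    if polysaccharide_g < 0 or fold_excess <= 0:
        raise ValueError("mass must be non-negative and excess positive")
    residues_mol = polysaccharide_g / residue_g_per_mol
    return residues_mol * fold_excess * reactant_g_per_mol * 1000.0


# Hypothetical dry mass of polysaccharide in the gel, in grams
print(round(reactant_mass_mg(0.05), 1))
```

The same function can be reused for another reactant by passing its molar mass as `reactant_g_per_mol`.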

Example 1

A 1% solution of locust bean gum (LBG) from Ceratonia siliqua seeds (Sigma G0753) was prepared in water and heated gently to dissolve as much as possible. Ten ml of LBG suspension was combined with 5 ml of DMF. The mixture was vortexed and used immediately or stored at 4° C. overnight or longer. Afterwards, the precipitate was filtered on Whatman P9 or another coarse grade of filter paper in a Buchner funnel on an aspirator. The swollen gel was washed on the filter paper with additional DMF, but not dried further. Approximately half of the swollen gel (about 0.5 g of LBG) was transferred to an amber glass vial. Six mg of 4MU-Ac (4-methylumbelliferyl acetate, Sigma M0883) dissolved in about 50 microliters of DMF was added. About 0.1 g of dry NBS was added. The mixture was vortexed and incubated in the dark overnight at 65° C. in a heat block. The vials were periodically removed, mixed by vortexing and returned to the heat block. The next morning, 20 ml of 95% ethanol was added and vortexed. The mixture was filtered on a Buchner funnel and washed several times with ethanol and finally with acetone. Washing can be done in a number of solvents that do not dissolve the polysaccharides (e.g., ethanol, acetone or DMF). The resulting brownish powder was dried at room temperature in the dark until the acetone smell was gone and the material appeared dry.

Example 2

LBG was first partially hydrolyzed and filtered to decrease the viscosity and make the polymer size distribution narrower. For partial hydrolysis, 5 g LBG was gradually added to 400 ml distilled water, stirred for 2 hours at room temperature and refrigerated overnight or longer. The suspension was re-equilibrated to room temperature with stirring. The temperature was increased to 70° C. with stirring and held at 70° C. for 30 minutes while stirring. The preparation was allowed to cool to room temperature to form a viscous suspension. The pH was adjusted to 3 with acetic acid and the beaker was covered with aluminum foil. The preparation was autoclaved for 100 minutes at 130° C. to hydrolyze some of the glycosidic bonds in the polymer and reduce the viscosity. The autoclaved mixture was neutralized with 3M NaOH to a pH of approximately 5.5, allowed to cool to room temperature, and the pH was readjusted to 7.0. Large insoluble precipitates were removed by filtration through a Whatman P8 filter in a Buchner funnel attached to a water aspirator. The filtrate volume was reduced either by evaporation or by partial lyophilization, and the final volume was adjusted to 100 ml. Three volumes of 95% ethanol were added to the remaining solution with stirring. The solution was allowed to stir for several hours to ensure complete precipitation, and the precipitate was collected on a P8 filter as before, washed twice with 95% ethanol on the filter, and lyophilized. The dried LBG could be stored indefinitely at 4° C.

Prior to reaction with 4MU-acetate (4MU-Ac), the hydrolyzed LBG was re-equilibrated with DMF. 0.5 g of LBG prepared as above was combined with 8 ml of 100% DMF in a vial. The vial was put into a 65° C. heat block for 4 days (3 nights) with occasional mixing. 50 mg of 4MU-Ac was added and the vial was vortexed. 0.5 g NBS (N-bromosuccinimide, Sigma B81255) was immediately added and the vial was mixed again by vortexing. The lightly capped vial was incubated at 65° C. overnight, with frequent vortexing over the first few hours. Following the reaction, the reaction mix was transferred to a beaker, and 100 ml DMF was used to rinse the vial. The rinse was subsequently added to the beaker. The mixture was stirred for 10 minutes at room temperature and the precipitate was collected on a P8 filter as before. The precipitate was removed to a beaker and washed with 100 ml of 95% ethanol, and refiltered. This process was repeated twice, for a total of 3 washes in 95% ethanol. The final precipitate was lyophilized and the dry powder stored at −20° C.

Proof of Derivatization from Example 1:

1. Lack of underivatized 4MU. Free 4MU is highly fluorescent at alkaline pH. Ether-derivatized 4MU is not. See Robinson, D., The fluorometric determination of β-glucosidase: its occurrence in the tissues of animals, including insects, Biochem. J., 63:39 (1956), which is hereby incorporated by reference in its entirety. Consequently, an increase in fluorescence when a solution's pH is adjusted to alkaline can be indicative of the presence of free 4MU. When 4MU-LBG was dissolved in a balanced salt solution at pH 5.5 and adjusted to pH 10 with 0.1M borate buffer, the fluorescence was very similar to its fluorescence at pH 5.5.

2. Lack of glycosidic 4MU. The locust bean gum used in this example consisted of a polymannose backbone with single galactose residues on approximately every 5th mannose residue, linked to mannose residues by a 1-6 β-bond. Consequently, the vast majority of mannose residues did not have a free C1 hydroxyl, since those groups were already part of a glycosidic bond. Similarly, virtually all of the galactose residues did not have a free C1 hydroxyl as that group was part of the branching C1→C6 glycosidic bond. To determine whether any 4MU reacted with a C1 hydroxyl group, 4MU-LBG was digested with a commercial hemicellulase. Commercial hemicellulase is a mixture of enzymes containing, among others, mannanase, xylanase, β-galactosidase and β-mannosidase. The mixture digested hemicelluloses including mannans down to a mixture of mono-, di- and tri-saccharides. Therefore, the mixed hemicellulase may liberate any 4MU bound through a glycosidic bond, whether to mannose or to galactose. The fluorescence intensity at pH 10 was not increased by incubation with commercial hemicellulase.

To extend these results and rule out any incorporation of 4MU via α-glycosidic bonds, the derivatized LBG was digested with commercially available α- and β-mannosidases, as well as with commercially available α- and β-galactosidases. As positive controls, commercially available 4MU-derivatized α- and β-mannose and α- and β-galactose were included as internal controls in some digestions. Table I shows the results. No enzyme liberated any fluorescence from 4MU-LBG. The presence of 4MU-LBG did not affect the hydrolysis of any of the commercial substrates, indicating a lack of competition of 4MU-LBG for any of the enzymes.

TABLE I
Effect of Commercial Glycoside Hydrolases on 4MU-derivatized LBG.

Enzyme             Negative Control    Cognate 4MU-pyranoside    4MU-LBG    4MU-LBG + cognate 4MU-pyranoside (internal control)
α-galactosidase    94                  50,971                    94         49,471
β-galactosidase    92                  127,112                   100        116,781
α-mannosidase      114                 115,450                   94         97,181
β-mannosidase      127                 120,332                   114        113,240
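The comparison drawn from Table I can be sketched numerically. The example assumes that the four numeric columns of Table I are, in order, the negative control, the cognate 4MU-pyranoside, 4MU-LBG, and 4MU-LBG plus the cognate 4MU-pyranoside (internal control); that column reading, the dictionary layout and the function name are assumptions made for illustration.

```python
# Fluorescence readings from Table I, in the assumed column order:
# (negative control, cognate 4MU-pyranoside, 4MU-LBG, 4MU-LBG + cognate)
TABLE_I = {
    "alpha-galactosidase": (94, 50971, 94, 49471),
    "beta-galactosidase": (92, 127112, 100, 116781),
    "alpha-mannosidase": (114, 115450, 94, 97181),
    "beta-mannosidase": (127, 120332, 114, 113240),
}


def summarize(table):
    """For each enzyme, report the 4MU-LBG signal above the negative control
    and the internal-control signal as a fraction of the cognate-only signal."""
    summary = {}
    for enzyme, (negative, cognate, lbg, internal) in table.items():
        summary[enzyme] = {
            "lbg_minus_negative": lbg - negative,
            "internal_over_cognate": internal / cognate,
        }
    return summary


for enzyme, values in summarize(TABLE_I).items():
    print(enzyme, values["lbg_minus_negative"],
          round(values["internal_over_cognate"], 3))
```

A 4MU-LBG signal close to the negative control is what the text describes as no liberated fluorescence; the second ratio shows how much of the cognate substrate signal remains when 4MU-LBG is also present.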

In addition, treatment of polysaccharides with 4% sulfuric acid at 250° C. for 60 minutes is a standard method to hydrolyze glycosidic bonds in polysaccharides. See Sluiter, A. et al., Determination of Structural Carbohydrates and Lignin in Biomass, Laboratory Analytical Procedure (LAP) (Version Jul. 8, 2011), Technical Report NREL/TP-510-42618 (2001), which is hereby incorporated by reference in its entirety.

Non-glycosidic ether bonds are not hydrolyzed by this procedure; however, ester bonds are hydrolyzed. If 4MU is derivatized via a glycosidic ether bond or via an ester bond, the acid hydrolysis treatment should liberate free 4MU. Putatively labeled locust bean gum was hydrolyzed in 4% sulfuric acid at 250° C. for 60 minutes. The treated samples were adjusted to pH 10 with concentrated borate buffer and their fluorescence determined as above. No increase in fluorescence was found.

3. Presence of 4MU on high molecular weight material. 4MU-LBG was analyzed by GP-HPLC. The system had detectors for both refractive index changes and for OD₂₅₄ absorbance. A peak of UV absorbance was detected in the column flow and its retention volume was consistent with hemicellulose. FIG. 1 shows column profile at OD₂₅₄ for both 4MU-derivatized LBG and LBG that had been treated identically except that no 4MU acetate was added to the reaction tube. Locust bean gum that had been putatively derivatized with 4MU had significant absorbance at OD₂₅₄ at the RID peak (red line in FIG. 1) while underivatized locust bean gum had no absorbance at 254 nm at the peak of refractive index (blue line in FIG. 1).

Bioprospecting for Microorganisms that can Release 4MU from Hemicellulose Derivatized on Non-Glycosidic Carbons with 4MU Via Ether Bonds.

General Method

Soil samples were taken beneath sites of wood decay in Maine forests and shaken with a balanced salt solution to suspend soil microbes. Soil and debris were allowed to settle for 10-30 minutes, and the supernatant was decanted. Microorganisms in the supernatant were used directly or pelleted by centrifugation and resuspended in 1/20 of the original volume of sterile water or balanced salt solution. See Connell, L. et al., Distribution and abundance of fungi in the soils of Taylor Valley, Antarctica. Soil Biol. Biochem., 38:3083-3094 (2006), which is hereby incorporated by reference in its entirety. The suspended microorganisms were inoculated into a sterile minimal medium containing hemicellulose modified via a non-glycosidic ether bond to a lignin or a derivative thereof or a lignin mimic as the major carbon source (for example, Highley's Balanced Salt Solution (HBSS = 2 g KH₂PO₄, 0.8 g MgSO₄.7H₂O, 0.1 g CaCl₂.2H₂O per liter containing 2 ml per L of trace element mix from American Type Culture Collection (Manassas, Va. 20110)) + 0.1-1.0% LBG, either benzylated (see below), 4-MU-derivatized, native or a mix). Alternatively or in addition, a modification of a medium formulation that has been used to isolate anaerobic bacteria may be used, modified so as to include, as the major carbon source, hemicellulose derivatized via a non-glycosidic ether bond to a lignin monomer, lignin mimic or derivative thereof. See Warnick, T. A. et al., Clostridium phytofermentans sp. nov., a cellulolytic mesophile from forest soil, Int. J. Syst. Evol. Microbiol. 52:1155-1160 (2002), which is hereby incorporated by reference in its entirety.
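As an illustration of the medium composition given above, the following sketch scales the per-liter HBSS quantities to an arbitrary batch volume and adds the LBG carbon source. It assumes that the LBG percentage is weight per volume (grams per 100 ml), which the text does not state explicitly; the function name and dictionary keys are hypothetical.

```python
# Per-liter HBSS components as listed in the text (grams per liter)
HBSS_G_PER_LITER = {
    "KH2PO4": 2.0,
    "MgSO4.7H2O": 0.8,
    "CaCl2.2H2O": 0.1,
}
TRACE_MIX_ML_PER_LITER = 2.0  # trace element mix, ml per liter


def medium_recipe(volume_ml: float, lbg_percent_w_v: float) -> dict:
    """Amounts needed for `volume_ml` of HBSS plus LBG.

    Assumption: `lbg_percent_w_v` is grams of LBG per 100 ml of medium
    and must lie in the 0.1-1.0% range given in the text.
    """
    if volume_ml <= 0:
        raise ValueError("volume must be positive")
    if not 0.1 <= lbg_percent_w_v <= 1.0:
        raise ValueError("LBG concentration outside the 0.1-1.0% range")
    liters = volume_ml / 1000.0
    recipe = {f"{salt} (g)": grams * liters
              for salt, grams in HBSS_G_PER_LITER.items()}
    recipe["trace element mix (ml)"] = TRACE_MIX_ML_PER_LITER * liters
    recipe["LBG (g)"] = lbg_percent_w_v / 100.0 * volume_ml
    return recipe


# Hypothetical 500 ml batch with 0.5% LBG
for component, amount in medium_recipe(500, 0.5).items():
    print(component, round(amount, 4))
```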

Because previous researchers have found that ether bonds are more easily broken at more extreme pH values (see Alexander, M., Biodegradation: problems of molecular recalcitrance and microbial fallibility, Adv. Appl. Microbiol. 7:35-80 (1965), which is hereby incorporated by reference in its entirety), the pH of the media may be adjusted from about pH 4.5 to about pH 5. Each formulation of medium may be incubated as appropriate either aerobically or anaerobically under an oxygen-poor atmosphere (e.g., nitrogen atmosphere). The cultures may be incubated in the dark at room temperature and can be checked at least every week for the development of the products of cleavage of the non-glycosidic ether bond between hemicellulose and lignin or a derivative thereof or a lignin mimic. Fluorescence was measured in a Biotek Synergy 2 plate reader (Biotek Instruments, Inc., Winooski, Vt. 05404) equipped with an excitation filter for 340-380 nm and an emission filter for 440-480 nm.

When cleavage products were detected, culture samples were plated onto agar plates made in the same growth medium as used in the previous step. Once colonies developed, the plates were treated so as to reveal the cleavage product and examined to determine which colonies were cleaving non-glycosidic bonds between hemicellulose and lignin or a derivative thereof or a lignin mimic. Those colonies were picked and replated until 100% of the colonies on the plate were positive and all the colonies had the same appearance under a microscope.

Alternatively, when the extent of cleavage on the non-glycosidic ether bonds to hemicellulose began to decrease in the initial flasks, an aliquot of the culture was diluted 10-fold into fresh medium as before. The extent of cleavage was monitored at least once per week. Typically, fluorescence once again began to rise and eventually to decrease. When it began to decrease, this enrichment process was repeated. On the third or fourth enrichment, a sample of the culture was plated directly onto agar plates made with the growth medium used previously. Individual colonies were examined under an inverted microscope for morphology and growth characteristics. Well separated individual colonies were picked and streaked onto fresh plates. The process was continued until at least 3-4 replicates of each morphology and growth type were obtained. This process is well-known to those skilled in the art.
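The enrichment rule described above (dilute 10-fold once fluorescence has risen and then begins to decrease) can be sketched as follows. The sketch assumes a simple series of weekly readings with no noise tolerance; the function names and the example readings are hypothetical and are not data from the experiments.

```python
from typing import Optional, Sequence, Tuple


def first_decline_index(readings: Sequence[float]) -> Optional[int]:
    """Index of the first reading that falls below the running peak,
    provided the signal has first risen above its initial value.
    Returns None if no such decline is seen."""
    if not readings:
        return None
    baseline = readings[0]
    peak = baseline
    for index, value in enumerate(readings[1:], start=1):
        if value > peak:
            peak = value
        elif value < peak and peak > baseline:
            return index
    return None


def dilution_volumes(final_ml: float, fold: float = 10.0) -> Tuple[float, float]:
    """Culture aliquot and fresh medium volumes for a `fold` dilution."""
    aliquot = final_ml / fold
    return aliquot, final_ml - aliquot


# Hypothetical weekly fluorescence readings for one flask
weekly = [100, 150, 400, 900, 850]
print(first_decline_index(weekly))   # week index at which to start the next enrichment
print(dilution_volumes(10.0))        # (aliquot ml, fresh medium ml)
```

Real plate-reader data fluctuate, so a practical version would likely require the drop to exceed some tolerance before triggering a transfer; that threshold would have to be chosen empirically.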

Because the assay for cleavage of non-glycosidic ether bonds between hemicellulose and lignin or a derivative thereof or a lignin mimic depends on fluorescence, the possibility of endogenous synthesis of fluorescent molecules by the isolated microbial strains was examined. Most simply, strains of microbes isolated as described were grown in liquid suspension cultures in the isolation medium. Once fluorescence had developed, spent medium was sampled and the cells were removed either by centrifugation or filtration through a 0.22 μm syringe filter. Fresh substrate for the detection of cleavage of non-glycosidic ether bonds between hemicellulose and lignin or a derivative thereof or a lignin mimic was added to the cell-free medium and the development of fluorescence was monitored over time. A lack of continued increase in fluorescence may be due to the absence of synthesis of autofluorescent compounds, or to the absence of metabolic recharging of required energy or redox co-factors, or to the loss of enzyme activity that was anchored on cell membranes. However, increased fluorescence was interpreted as evidence for the presence of a soluble enzyme activity that may not require an energy or redox cofactor.

In all embodiments, those strains whose enzyme activity was not freely available in the culture medium may be characterized to determine whether the enzyme activity was free in the culture supernatant (soluble or tethered) and whether it used an energy cofactor (energy cofactor-dependent or -independent). The strategy can be summarized in a decision tree as shown in FIG. 2.

Experiment 1: The most effective cellulase systems are not freely secreted but are found tethered to the surface of anaerobic bacteria in a macromolecular structure called a cellulosome. See Chang, M. C. Y., Harnessing energy from plant biomass, Curr. Op. Chem. Biol. 11:677-684 (2007), which is hereby incorporated by reference in its entirety. In addition, another enzyme targeting C₂ of sugars (2-pyranose oxidase) is tethered to the surface of basidiomycete fungi. See Danneel, H. J. et al., Purification and characterization of a pyranose oxidase from the basidiomycete Peniophora gigantea and chemical analyses of its reaction products, Eur. J. Biochem. 214(3):795-802 (1993) and Prongjit, M. et al., Kinetic mechanism of pyranose 2-oxidase from Trametes multicolor, Biochem. 48(19):4170-4180 (2009), both of which are hereby incorporated by reference in their entirety. The question of tethered versus free enzyme activity can be answered by separating cells from culture supernatant and challenging the resuspended cell pellet with fresh substrate. Cells that have been heat treated, protease treated, or fixative treated serve as controls. An increase in the concentration of cleavage products relative to controls can be taken as evidence for the presence of tethered enzyme on the washed cells, as long as the time course of generation of cleavage products from substrate is relatively fast. In the event that the activity is slow, the possibility of de novo synthesis of soluble enzyme activity may become significant. An inhibitor of protein synthesis (e.g., chloramphenicol or blasticidin S) may be added to ensure that the cells are not synthesizing and exporting an enzyme activity. Alternatively, the supernatant can be removed from the cells and given a second, separate incubation to see if a soluble activity is present. The latter does not make an a priori assumption that activity against the substrate is due to a protein.

Experiment 2: To test whether any putative enzyme activity uses a cofactor, the cell fraction from Experiment 1 may be treated with an inhibitor of cellular respiration. The final choice of inhibitor depends on the characteristics of the cells (prokaryotic or eukaryotic, aerobic or anaerobic, etc.). For example, the carboxamide antibiotics, directed against succinate dehydrogenase, may affect both bacteria and fungi. Following treatment with the inhibitor, cells can be incubated for 30 minutes to use up endogenous stores of cofactor, and fresh substrate may be added to ensure that the activity is not substrate-limited. The concentration of the cleavage products may be followed over time. If the concentration of cleavage products continues to increase, it can be tentatively concluded that an energy cofactor was not used for activity. In this situation, because cellular energy metabolism is used for protein synthesis, de novo synthesis of enzyme activity may not be a concern.

Experiment 3: To confirm that a soluble activity uses an energy cofactor, particularly when there is no observable activity in Experiments 1-2, culture supernatant can be combined with either control cells or cells that have been washed and then treated with an inhibitor of cellular respiration. If activity can be restored by exposing inactive culture supernatant to cells that can make energy cofactors, cleavage activity may be presumed to be soluble with a need for an energy cofactor. Incubating culture supernatant with untreated washed cells may serve as a positive control. As before, if the time course of cleavage is slow, a protein synthesis inhibitor may be incorporated into the experiments.
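The logic of Experiments 1-3 can be summarized as a small decision function. This is only one reading of the text, not a reproduction of FIG. 2: the argument names, the category labels and the order in which the observations are consulted are assumptions introduced for illustration.

```python
from typing import Optional


def classify_activity(supernatant_alone_active: bool,
                      washed_cells_active: bool,
                      cells_active_with_respiration_inhibitor: Optional[bool] = None,
                      supernatant_restored_by_live_cells: Optional[bool] = None) -> str:
    """Tentative classification of a cleavage activity.

    Assumed reading of the experiments:
      - cell-free supernatant active on fresh substrate -> soluble, no cofactor
      - washed cells active (Experiment 1) -> tethered; Experiment 2 then
        decides cofactor dependence
      - inactive supernatant restored by cofactor-producing cells
        (Experiment 3) -> soluble, cofactor-dependent
    """
    if supernatant_alone_active:
        return "soluble, energy cofactor-independent"
    if washed_cells_active:
        if cells_active_with_respiration_inhibitor is None:
            return "tethered, cofactor dependence undetermined"
        if cells_active_with_respiration_inhibitor:
            return "tethered, energy cofactor-independent"
        return "tethered, energy cofactor-dependent"
    if supernatant_restored_by_live_cells:
        return "soluble, energy cofactor-dependent"
    return "undetermined"


# Hypothetical outcome: washed cells cleave substrate even with inhibitor
print(classify_activity(False, True, cells_active_with_respiration_inhibitor=True))
```

The controls described in the text (heat-, protease- or fixative-treated cells, and protein synthesis inhibitors for slow time courses) are not modeled here; each boolean is taken to mean activity relative to the appropriate control.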

Example 3

Soil samples were collected from beneath thoroughly rotted softwood, placed in sterile containers and returned to the laboratory as quickly as possible. Fifty grams of soil was combined with 100 ml of sterile Highley's Balanced Salt Solution. The mixture was shaken gently at room temperature for 1 hour. Soil was permitted to settle for 10-30 minutes and the supernatant was decanted. Microorganisms in the supernatant were pelleted by centrifugation and resuspended in 1/20 of the original volume of sterile water. See Connell, L. et al., 2006, supra. One ml of the soil supernatant was inoculated into 10 ml of sterile culture medium in sterile 25 ml Erlenmeyer flasks. The culture medium consisted of HBSS supplemented with 2 g/L of ammonium nitrate and 0.3% 4MU-LBG. One flask was inoculated with sterile HBSS to serve as a non-inoculated control. The cultures were incubated at room temperature in the dark. At weekly intervals, 0.1 ml was withdrawn from each culture into a black 96 well plate. About 0.1 ml of 0.3M sodium borate, pH 9.8-10, was added to intensify the fluorescence. Fluorescence was measured in a Biotek Synergy 2 plate reader (Biotek Instruments, Inc., Winooski, Vt. 05404) equipped with an excitation filter for 340-380 nm and an emission filter for 440-480 nm. The fluorescence was compared to that measured for non-inoculated control medium.

Over the course of 3-6 weeks, fluorescence developed in some of the flasks. When fluorescence began to decline, a sample of the culture was spread onto sterile petri dishes containing fresh medium solidified with 0.16% agar. When colonies developed, plates were examined using a hand-held UV light to note which colonies were fluorescent. Alternatively, a PVDF (polyvinylidene difluoride) membrane was overlaid on the plates. 4MU binds strongly to PVDF. The membrane was removed and rinsed at pH 8-10 to intensify 4MU fluorescence. Fluorescent spots on the membrane were correlated to colonies on the plate. Fluorescent colonies were picked and re-spread on new plates. The process was repeated until pure cultures were obtained.

Putative positive colonies were grown in suspension cultures in the medium originally used for isolation (HBSS+ammonium nitrate+4MU-LBG). Spent culture medium was passed through a sterile filter to remove cells and fresh 4MU-LBG was added. If additional fluorescence developed, the culture was considered to be producing and exporting an enzyme that liberated 4MU from the substrate. If not, it was concluded that either:

a) the cells in the culture were producing an autofluorescent molecule, and once the cells were removed, no additional fluors could be synthesized; b) the enzyme was located on the cell surface (cellulosomal); or c) the enzyme required the on-going presence of an energy cofactor that required cellular metabolism to produce. Two or more of these explanations could be present simultaneously.

Autofluorescence was eliminated in two different ways. First, it was presumed that any fluorescent molecules synthesized by the cells would have different spectral characteristics than 4MU. Consequently, emission and excitation wavelength scans of spent medium were performed. Interference by contaminants in the LBG preparation rendered these tests inconclusive in some cases.

Alternatively, to eliminate false positives due to autofluorescence, the cells were grown in the presence of an enzyme substrate that would not yield fluorescence. A benzylated derivative of LBG was synthesized based on a method developed by Lu. See Lu, Y., Benzyl konjac glucomannan, Polymer 43:3979-3986, (2002), which is hereby incorporated by reference in its entirety.

Synthesis of Benzylated Locust Bean Gum (LBG), 10 Gram Scale

The synthesis of benzylated LBG, as illustrated in FIG. 3, was performed as follows:

(1) Locust bean gum (10 gm) was dissolved in 400 ml pure water with overhead stirring at 200 rpm. Tetrabutyl ammonium iodide (250 mg) was added to act as a phase transfer catalyst. The temperature was maintained at 40-45° C. with stirring for 1 hour.

(2) 100 grams of 40% (w/v) aqueous NaOH was added dropwise to the reaction mixture. The temperature was maintained at 40-45° C. with stirring for an additional hour.

(3) Approximately 25 ml benzyl chloride was added dropwise and the mixture was stirred overnight at 90° C.

(4) The reaction was cooled to room temperature and neutralized with acetic acid, leading to some precipitation of the benzylated material.

(5) Complete precipitation was accomplished by the addition of 300 ml of ethanol dropwise with stirring to a final concentration of ˜60% (v/v).

(6) The benzylated LBG was filtered through P8 filter paper on a water aspirator. The precipitate was washed in a beaker with ˜300 ml of 95% ethanol, added dropwise while stirring, and then stirred at room temperature for 1 hour.

(7) The washed precipitate was recovered by filtration as before and washed with 95% ethanol as before an additional two times.

(8) The final precipitate was dried on the aspirator for 30 minutes and then dried in an oven at ˜70-75° C. overnight to yield a white and fluffy powder.

(9) The final product was stored at −20° C.

An aliquot of a putatively positive microbial strain was inoculated into each of two parallel suspension cultures. One flask's sole carbon source was LBG mixed with benzylated LBG. The second flask's sole carbon source was LBG mixed with 4MU-LBG. The total concentration of LBG+modified LBG was 0.3%, but the concentration of the modified LBG, whether benzylated or 4-methylumbelliferone-derivatized, ranged from 0.05% to 0.3%. The concentration of the modified LBG was kept consistent within a single experiment. If fluorescence that developed in the culture containing 4MU-LBG was significantly higher than the fluorescence developed in the culture containing benzylated LBG, it was presumed that the difference was due to the liberation of 4MU by enzymatic activity.
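The substrate mix described above can be worked out for any culture volume. The following is a minimal sketch, assuming the percentages are weight per volume (grams per 100 ml), which the text does not state explicitly. The function name and the 10 ml culture volume are illustrative only and are not taken from the procedure.

```python
def substrate_masses_mg(culture_ml, modified_pct, total_pct=0.3):
    """Return (native LBG mg, modified LBG mg) for one culture.

    Assumes percentages are w/v, i.e. grams per 100 ml (an assumption;
    the text gives only bare percentages).
    """
    if not 0 <= modified_pct <= total_pct:
        raise ValueError("modified LBG cannot exceed the total LBG concentration")
    total_mg = total_pct / 100 * culture_ml * 1000
    modified_mg = modified_pct / 100 * culture_ml * 1000
    return total_mg - modified_mg, modified_mg


# Hypothetical 10 ml culture at the lowest modified-LBG level in the text (0.05%).
native, modified = substrate_masses_mg(10, 0.05)
print(f"native LBG: {native:.1f} mg, modified LBG: {modified:.1f} mg")
```

Using the same function for both parallel flasks keeps the total polysaccharide load identical, so that only the identity of the modifying group (benzyl or 4MU) differs between them.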

Microbial strains that synthesized a putative soluble and co-factor independent activity were identified by both fatty acid methyl ester analysis and by rDNA sequencing.

Example 4

Soil microorganisms were collected and incubated as in Example 3. However, when fluorescence of the primary cultures began to decline, 1 ml of the culture was inoculated into 10 ml of fresh medium. The new cultures were incubated as before, at room temperature in the dark without shaking, and the culture supernatant monitored as before by sterilely withdrawing an aliquot of culture supernatant, adjusting its pH to about 9.8, and comparing fluorescence to that of a non-inoculated control culture. This enrichment process was repeated 3-4 times.
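The cumulative dilution of the original soil suspension over the enrichment passages can be estimated as follows. This is a rough sketch, assuming each passage is exactly 1 ml of culture into 10 ml of fresh medium (11 ml total) and ignoring any growth or death between passages; the function name is hypothetical.

```python
def carryover_fraction(rounds, inoculum_ml=1.0, fresh_ml=10.0):
    """Fraction of the original culture volume carried through `rounds` passages."""
    per_round = inoculum_ml / (inoculum_ml + fresh_ml)
    return per_round ** rounds


for rounds in (3, 4):
    fold = 1 / carryover_fraction(rounds)
    print(f"after {rounds} passages: about 1 in {fold:.0f}")
```

Material that does not propagate in the selective medium, including residual soil-derived carbon sources, is diluted by this factor, which is one reason repeated passage enriches for organisms able to grow on the supplied substrate.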

After the 3rd and 4th enrichments, samples of the culture medium were plated onto HBSS+ammonium nitrate+0.3% LBG+0.16% agar. Plates were incubated at room temperature and examined daily to determine the numbers and morphologies of the colonies present. Several colonies for each morphology found were individually picked into fresh liquid medium and resuspended. Each enriched culture yielded about 15-30 colony types.

Each individual colony suspension was replated individually. Each plate was examined for purity, and colonies re-picked. The process was repeated until all the different isolates were pure. Frozen stocks were made for each colony.

Each individual freezer stock was inoculated into a suspension culture in which the growth medium's sole carbon source was LBG mixed with 4MU-LBG. However, in some cases, additional sources of organic nitrogen like yeast extract or malt extract were also added, adding a small amount of additional carbon source. The total concentration of LBG+modified LBG was 0.3%, but the concentration of the 4MU-LBG ranged from 0.05% to 0.3%. The concentration of the modified LBG was kept consistent within a single experiment. The culture supernatant was monitored at least twice weekly for the development of fluorescence as previously described. Cultures that developed fluorescence were further screened for autofluorescence by one of the methods as described in Example 3.

For both Examples 3 and 4, microbial strains that synthesized a putative soluble and co-factor independent activity were selected and identified by both fatty acid methyl ester analysis and by rDNA sequencing. In some cases, fatty acid methyl ester analysis was not performed and two independent rDNA sequence analyses were performed.

A novel bacterial strain, B603, was isolated. Fatty acid methyl ester analysis indicated an excellent (similarity index/standard deviation of 0.396) match to Xanthomonas axonopodis vasculorum. rDNA analysis indicated a 100% match to Luteibacter rhizovicinus (strains 1176, 1196, and 1199) as well as some Dyella species. See Stark, M. et al., MLTreeMap—accurate Maximum Likelihood placement of environmental DNA sequences into taxonomic and functional reference phylogenies, BMC Genomics 11:461 (2010), which is hereby incorporated by reference in its entirety. It is believed that B603 is closely related to Luteibacter rhizovicinus. A scanning electron micrograph is shown in FIG. 4.

Example 5

A novel microorganism that can release 4MU from xylan derivatized on non-glycosidic carbons with 4MU via phenolic ether bonds.

A fluorogenic substrate analogous to 4MU-LBG was constructed based on a commercial xylan prepared from birchwood. Unlike mannan, xylan is based on a 5 carbon sugar, xylose. Derivatization of chains of hexoses in the pyranose configuration can take place at a primary hydroxyl. However, in xylan, the chains of 5 carbon sugars in the pyranose configuration lack primary hydroxyl groups for derivatization. Derivatization, therefore, may be at a secondary hydroxyl, a more difficult challenge energetically. In addition, the carbons with secondary hydroxyls, C₂ and C₃ of xylose, are anomeric. Depending on the mechanisms of derivatization steps, the stereoconfiguration of the xylose residue may be altered, changing the nature of the sugar residues.

A more conventional method of derivatization was devised to avoid changing the stereoconfiguration and to overcome the lower reactivity of the secondary hydroxyl groups.

Step 1: Controlled hydrolysis of xylan. The length of the xylan polymer affects solubility in the solvents needed for the synthesis. A controlled hydrolysis at pH 3 was used to increase xylan's solubility in DMSO and pyridine. Approximately 10 g of xylan from birchwood (Sigma Cat. No. X0502) was placed into a beaker containing 250 mL water (adjusted to pH=3 using glacial acetic acid). The solution was autoclaved for 100 min at 130° C. After autoclaving, the material was neutralized with 3-10 M sodium hydroxide and lyophilized overnight. The xylan was then washed by stirring it with ethanol for 60 minutes followed by filtration through P8 paper at room temperature. The wash was repeated. The residue was dried under vacuum and then lyophilized.

Step 2: Glycosylation. Free glycosidic hydroxyls were methylated by refluxing the xylan in dry methanol. The xylan was then dissolved in 100 ml hot water, and precipitated by adding 200 ml 95% ethanol (dropwise), and filtered using P8 filter paper. The precipitate was oven dried overnight at 70° C. Xylan was then quickly transferred to a 200 ml RB (round bottom) flask pre-flushed with N₂ and fitted with an air cooling condenser. A solution of 1 ml of conc. HCl in 25 ml of anhydrous methanol was transferred to the reaction flask. The reaction was stirred at reflux temperature (˜85° C.) overnight. The reaction was cooled down in an ice bath for ½ h and then filtered through P8 filter paper. The precipitate was dissolved in 50 ml of water and neutralized using NaHCO₃. The xylan was reprecipitated with 100 ml of ethanol and the precipitate was separated by filtration through P8 filter paper and dried at 70° C. overnight.

Step 3: Benzylation. The residue was placed into a dry RB flask (flushed with nitrogen), followed by the addition of dry DMSO (210 mL). The mixture was stirred at 70° C. for 3 h, cooled to room temperature, and then powdered potassium hydroxide (9 mol KOH per mol OH group) was added. A catalytic amount of crown ether (benzo-18-crown-6) was added to the mixture. The mixture was stirred overnight at 50° C. An ice bath was used to cool the mixture down to 0° C., followed by dropwise addition of benzyl bromide (3 mol benzyl bromide/mol OH group) through a septum. After 15 minutes, the temperature was increased to 70° C. and the mixture was stirred overnight. The mixture was cooled in an ice bath, methanol (500 mL) was added to precipitate the polysaccharide, and the xylan was separated by filtration through a P8 filter paper. The precipitate was washed at least twice in methanol, ethanol or acetone. The wash could be on filter paper, or with resuspension in a beaker with or without stirring and at room temperature or up to 70° C. followed by filtration through P8 paper. Sometimes the xylan was resuspended in water and neutralized, followed by precipitation and drying or lyophilization.

Step 4: Triflation. The benzylated xylan was added to a dry RB flask that was flushed with nitrogen. Pyridine (3 mL) was added to the flask and the solution stirred for 30 min at 60° C. The reaction mixture was cooled to room temperature, then to 0° C. using an ice bath, and finally to −20° C. with an isopropanol bath (stored at −80° C.). Triflic anhydride (1-2 mL) was carefully added to the solution. The reaction was stirred for 15 min at −20° C., then at room temperature overnight. The mixture was frozen at −80° C., and subsequently lyophilized. Ethanol (95%) was used to wash the residue for 15-30 min. The mixture was filtered through P8 paper, then the residue dried over vacuum for 15 min, and finally lyophilized.

Step 5: Bromination. The triflated material was placed into a dry RB flask and an air condenser was attached to the flask. Dioxane (100 mL) was added to the flask, and nitrogen was used to purge the system of moisture. Dry tetra-n-butylammonium bromide (10.65 g) was added through the septum on the condenser. The mixture was refluxed at 105° C. for 24 h, cooled to room temperature, frozen and finally lyophilized. Ethanol (300 mL) was added to the residue and the mixture was stirred for 1 h. The mixture was filtered and the residue was washed two more times, followed by lyophilization.

Step 6: Addition of 4-MU. 4-MU was added to a dry RB flask followed by addition of dry DMSO (25 mL). The flask was placed into an ice bath and sodium hydride (0.5 g) was added dropwise. The flask was purged of moisture using nitrogen. The reaction was allowed to stir for 30 min, followed by the addition of the brominated xylan. The reaction was stirred overnight. Ethanol (100 mL) was added slowly to the flask at 0° C. The suspension was filtered, and the precipitate was washed three times with ethanol and lyophilized. The preparation of xylan derivatized with 4MU was qualified by similar experiments to those used on derivatized LBG, including digestion with commercially available α- and β-xylosidases.
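The reagent ratios given in Step 3 (9 mol KOH and 3 mol benzyl bromide per mol of hydroxyl group) can be converted to approximate quantities as sketched below. The sketch rests on several assumptions that are not stated in the procedure: the xylan is treated as unsubstituted anhydroxylose repeat units (C5H8O4) with two free hydroxyls each, end groups and losses from Steps 1 and 2 are ignored, and the 10 g input simply reuses the Step 1 starting mass. The molar masses and the benzyl bromide density are standard approximate values; the function name is hypothetical.

```python
ANHYDROXYLOSE_G_PER_MOL = 132.11  # C5H8O4 repeat unit of xylan
OH_PER_RESIDUE = 2                # assumption: free C2 and C3 hydroxyls only
KOH_G_PER_MOL = 56.11
BNBR_G_PER_MOL = 171.03           # benzyl bromide, C7H7Br
BNBR_G_PER_ML = 1.438             # approximate density of benzyl bromide


def benzylation_reagents(xylan_g, koh_equiv=9.0, bnbr_equiv=3.0):
    """Return (mol OH, g KOH, mL benzyl bromide) for a given xylan mass."""
    mol_oh = xylan_g / ANHYDROXYLOSE_G_PER_MOL * OH_PER_RESIDUE
    koh_g = mol_oh * koh_equiv * KOH_G_PER_MOL
    bnbr_ml = mol_oh * bnbr_equiv * BNBR_G_PER_MOL / BNBR_G_PER_ML
    return mol_oh, koh_g, bnbr_ml


mol_oh, koh_g, bnbr_ml = benzylation_reagents(10.0)
print(f"{mol_oh:.3f} mol OH, {koh_g:.1f} g KOH, {bnbr_ml:.1f} mL benzyl bromide")
```

Because birchwood xylan carries side groups and the actual mass entering Step 3 is lower than the starting mass, these figures are upper-bound estimates rather than the quantities actually used.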

For bioprospecting using xylan derivatized with 4MU, soil samples were taken beneath sites of well-rotted hardwood in a Maine forest. As described above, the soil was shaken with a balanced salt solution to suspend soil microbes. Soil and debris were allowed to settle for 10-30 minutes, and the supernatant was decanted. Optionally, the supernatant may be diluted. See Connell, L. et al., 2006, supra. The suspended microorganisms were inoculated into a sterile minimal medium (HBSS containing 2 ml/L of trace element mix from American Type Culture Collection, plus 0.1-1.0% birchwood xylan (Sigma Cat. No. X0502) as the sole or major carbon source). The xylan was either 4-MU-derivatized, native, benzylated (xylan prepared for derivatization with 4MU, but the preparation stopped after benzylation [see Step 3 above]) or a mix. In addition, a modification of a medium formulation that has been used to isolate anaerobic bacteria was used. See Warnick, T. A. et al., 2002, supra. However, because previous researchers have found that ether bonds are more easily broken at more extreme pH values, the pH of the media was adjusted to pH 4.5-5. See Alexander, M., (1965), supra. Each formulation of medium was incubated both aerobically and anaerobically under a nitrogen atmosphere. The cultures were incubated in the dark at room temperature and checked every week for the development of fluorescence.

When fluorescence developed and began to decrease in the initial flasks, 1 ml of culture was diluted into 10 ml of fresh medium as above as an enrichment step. The fluorescence was monitored at least once per week. Typically, fluorescence once again began to rise and eventually to decrease. When fluorescence began to decrease, the enrichment process was repeated. Concurrently with each enrichment step, a sample of the culture was plated directly onto agar plates made with the growth medium described above but using xylan instead of 4MU-xylan. Individual colonies were examined under an inverted microscope for morphology and growth characteristics. Well separated individual colonies were picked and streaked onto fresh plates. The process was continued until at least 3-4 replicates of each morphology and growth type were obtained. Each isolate was then grown under the same conditions as in the enrichment flasks to see if fluorescence developed. Those isolates that did not develop fluorescence were discarded.

Any fluorescence that developed in cultures of the isolated colonies could have been due to generation of 4MU from the substrate and/or to autofluorescence. To eliminate those colonies that were merely autofluorescent and did not generate 4MU from the substrate, strains of isolates were grown in liquid suspension cultures in the isolation medium. Once fluorescence had developed, spent medium was sampled and the cells were removed either by centrifugation or by filtration through a 0.22μ syringe filter. Fresh 4MU-derivatized xylan (4MU-X) was added to the cell free medium and the development of fluorescence was monitored over time. A lack of continued increase in fluorescence could be due to the absence of synthesis of autofluorescent compounds, or to the absence of metabolic recharging of required energy or redox co-factors, or to loss of enzyme activity anchored on cell membranes. Increased fluorescence was interpreted as evidence for the presence of a soluble enzyme activity that did not require an energy or redox cofactor.

Surprisingly, 12 prokaryote strains and 1 mycelial fungus were isolated whose cell-free culture supernatant could release 4MU from 4MU-X. Some of the 12 prokaryotes were closely related to one another. For example, three prokaryotes were different Paenibacillus strains, and two isolates were different Burkholderia strains. One of the Paenibacillus strains, E518, was more extensively studied. Its rDNA sequence's closest homologies were Paenibacillus sp. Y412MC10, Paenibacillus polymyxa, and Paenibacillus terrae HPL-003.

Novel Enzymes that Cleave Non-Glycosidic Bonds Between Hemicellulose and Lignin or Derivative Thereof or a Lignin Mimic

The activity that cleaved non-glycosidic phenolic ether bonds between lignin and mannan from B603 and the activity that cleaved non-glycosidic phenolic ether bonds between lignin and xylan from E518 were shown to be due to proteinaceous enzymes by a pre-digestion of the culture supernatants with proteases. The incubation with protease destroyed activity for both B603 supernatant and E518 supernatant. An example of a protease digestion experiment is shown in Table II for B603. These results were repeated with multiple different proteases for both B603 and E518 supernatants.

TABLE II
Effect of S. griseus protease on HLE activity

Substrate                        Percent Relative to    Percent Relative to
                                 Control-Untreated      Control-Protease Treated
Fluorogenic model substrate 1    100%                   19.8%
Fluorogenic model substrate 2    100%                   21.4%

Culture supernatant from B603 was filtered through 0.45μ filters to remove cells and incubated with 2 different preparations of fluorogenic model compound for 24 hrs at 24° C. Numbers shown are the averages of the two replicates.
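The residual activities in Table II can be restated as the fraction of activity lost on protease treatment. The short sketch below only rearranges the two values reported in the table; the variable and function names are illustrative.

```python
# Residual activity after protease treatment, in percent of the untreated
# control, copied from Table II.
RESIDUAL_PCT = {
    "Fluorogenic model substrate 1": 19.8,
    "Fluorogenic model substrate 2": 21.4,
}


def percent_activity_lost(residual_pct):
    """Activity lost relative to an untreated control defined as 100%."""
    return 100.0 - residual_pct


for substrate, residual in RESIDUAL_PCT.items():
    print(f"{substrate}: {percent_activity_lost(residual):.1f}% of activity lost")
```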

The activity isolated from B603 is referred to herein as MLE (mannan:lignin etherase). MLE is the only enzyme known to cleave 4MU from 4MU-LBG. Glycosidases that have been tested against 4MU-LBG and do not release 4MU are listed in Table III.

The activity isolated from E518 is referred to herein as XLE (xylan:lignin etherase). XLE is the only enzyme known to cleave 4MU from 4MU-xylan (see Table III). Similarly, none of the carbohydrate-active enzymes tested released 4MU from 4MU-xylan. MLE also does not release 4MU from 4MU-xylan.

TABLE III
Effect of Various Carbohydrate-Active Enzymes on 4MU-LBG and 4MU-Xylan

Enzymes Unable to Cleave 4MU from either 4MU-LBG or 4MU-Xylan:
  Cellulase from Trichoderma reesei ATCC 26921
  Hemicellulase from Aspergillus niger
  Isoamylase from Pseudomonas sp.
  Pullulanase from Klebsiella pneumoniae
  Xylanase from Thermomyces lanuginosus
  α-Amylase from A. oryzae
  α-Galactosidase from green coffee beans
  α-Mannosidase from Canavalia ensiformis
  β-Galactosidase from Aspergillus oryzae
  β-Mannosidase from Helix pomatia
  β-Xylosidase (CAZyme Xylosidase 1)

Enzymes Able to Cleave 4MU from 4MU-LBG:
  MLE

Enzymes Able to Cleave 4MU from 4MU-Xylan:
  XLE

For all the enzymes listed in Table III, control experiments with commercially available substrates were tested in parallel to confirm that the enzymes were active and that the experimental conditions were consistent with activity of the enzymes. Where conditions permitted, internal controls were used, as previously shown and discussed in Table I.

Table IV lists some of the glycosidase substrates tested with MLE under conditions in which MLE is known to be active (pH 5-5.5, 50-150 mM ionic strength, 30° C., presence of a trace mineral supplement). Each substrate was shown to be cleavable by a known enzyme in parallel to the MLE digestion. In addition, every experiment included a parallel experiment in which the MLE preparation was tested against 4MU-LBG to ensure that the MLE used was active.

TABLE IV
Effect of MLE on Various Glycosidase Substrates

Substrates Unaffected by MLE:
  Hydroxyethylcellulose dyed with Ostazin red
  4-O-Methyl-D-glucurono-D-xylan-Remazol blue
  Glycogen azure
  Azo-carob galactomannan
  Carboxymethyl cellulose
  4-Methylumbelliferyl-β-D-lactoside
  4-Methylumbelliferyl-α-D-mannopyranoside
  4-Methylumbelliferyl-β-D-mannopyranoside
  4-Methylumbelliferyl-β-D-xylopyranoside
  4-Nitrophenyl-α-D-mannopyranoside
  4-Nitrophenyl-β-D-mannopyranoside
  p-Nitrophenol-β-D-glucopyranoside
  Starch Azure
  4-Methylumbelliferyl-β-D-xylan

Substrates Cleaved by MLE:
  4MU-LBG

Recombinant MLE (rMLE) also shows activity against native cellulosicbiomass (see below).

Since the activity of MLE is novel, there was no available genetic or protein probe to use to isolate the gene of interest from B603. The gene was cloned based on its encoded enzyme's activity against the macromolecular substrate, 4MU-LBG. Because 4MU-LBG is a mix of very large macromolecules, there was little likelihood that it would be able to enter the cells and encounter recombinant enzyme once it was expressed in a cloning host such as Escherichia coli. Consequently, a bacteriophage lambda cloning system was chosen. Bacteriophage lambda causes extensive cell lysis, releasing the recombinant protein into the surrounding medium, where it can come into contact with its potential substrate, 4MU-LBG.

A cDNA library was prepared from B603 mRNA isolated from cells that were expressing the enzyme. To obtain the mRNA, B603 cells were grown in medium containing 4MU-LBG as the sole carbon source (see above for complete medium composition). At times bracketing the usual expression times for MLE, approximately 200 ml of cells were harvested and samples of the culture supernatant were assayed for MLE activity. Because the assay for MLE activity takes at least 24 hours, RNA was prepared at all time points, and those preparations of RNA from cells which were not expressing MLE were discarded. Total RNA was prepared using the Ribo-Pure kit for bacteria from Ambion (now Life Technologies, Grand Island, N.Y. 14072). Briefly, cell walls were disrupted by mixing cells with an RNase inhibitory solution, and vortexing them with zirconia beads. The lysate was then extracted with chloroform to yield an upper aqueous phase that contained the RNA. The RNA was further purified by dilution with ethanol and bound to a silica filter followed by an aqueous, low ionic strength elution. The RNA was stored frozen at −80° C. until needed. In some cases, the cell pellets were frozen in liquid nitrogen, stored at −80° C., and the RNA extraction was performed at a later time point. RNA preparations or frozen cell pellets from time points at which no MLE activity was detected in culture supernatants were discarded. Quality of the RNA preparation was checked by gel electrophoresis.

The 16S and 23S rRNA background was reduced using the MICROBExpress™ Bacterial mRNA Enrichment Kit from Ambion (now Life Technologies, Grand Island, N.Y. 14072) following manufacturer's directions. The kit contained magnetic beads derivatized with oligosequences complementary to conserved regions of 16S and 23S prokaryotic ribosomal RNA. A large part of the rRNA in the B603 RNA preparation hybridized to the beads and was removed from the solution, enriching the preparation for mRNA. The purified mRNA was reverse-transcribed using random primers and ligated into the XhoI-EcoRI site of Lambda-ZAP II according to manufacturer's directions (Stratagene, now Agilent Technologies, Santa Clara, Calif. 95051). The titer of the library was measured as recommended by the manufacturer.

An aliquot of the λ library was plated onto a lawn of XL1-Blue E. coli according to manufacturer's directions (Stratagene, now Agilent Technologies, Santa Clara, Calif. 95051) except that 4MU-LBG (0.25% w/v) which had been briefly treated with a commercial hemicellulase was incorporated into the top agarose. Once plaques were visible in the top agarose, the plates were overlaid with PVDF membranes that had been wetted in methanol and rinsed in sterile Highley's buffer to remove the methanol. The lifts were rinsed briefly in 0.1M borate buffer, pH 9.0 to intensify the 4MU fluorescence and examined under shortwave UV light. The plaques corresponding to the fluorescent spots on the filter paper were excised as agarose plugs, and the phage contained in the plugs were eluted into SM (0.1M NaCl, 8 mM MgSO₄, 50 mM Tris, pH 7.5). The eluted phage were replated and rescreened as before until all plaques on the plate were fluorescent in the plaque lift assay. An additional round of plating, screening, and picking of an isolated plaque from the putatively purified phage clone was performed to guarantee purity. Individual fluorescent plaques were selected from the final pure plate and eluted from the top agarose into SM with 50% glycerol to create a freezer stock. Each freezer stock was used to create plasmids in BlueScript using LambdaZapII's autosubcloning feature according to manufacturer's directions (Stratagene, now Agilent Technologies, Santa Clara, Calif. 95051). Several well separated colonies of E. coli containing the plasmids were used to create independent freezer stocks for each putative positive. See Sambrook et al., 1989, supra.

Some false positives were eliminated by testing the phage stocks for inducible expression. Each independently cloned phage stock was plated onto an E. coli lawn in the absence of fluorogenic substrate, allowed to grow overnight and lifted onto PVDF membrane as before. Plaque lifts were rinsed with 0.1M borate buffer, pH 9, dried, and examined under short-wave UV light. Those phage preparations that were strongly positive in the absence of a fluorogenic substrate were tentatively considered to be false positives. Sequence analysis of the false positives was performed using primers based on the T3 and T7 sites of the Bluescript vector. Most of the false positives were clearly ribosomal DNA, probably resulting from rRNA that was not removed during the mRNA enrichment. These inserts had no significant open reading frames and appeared to fortuitously produce short peptides that had some intrinsic fluorescence. Other false positives also appeared to produce short fluorescent peptides.

The phages from the remaining positives were amplified in broth culture, and supernatant from the lysates was filtered and incubated with 4MU-LBG to determine whether activity against the substrate was present. Those clones without activity against the substrate were discarded.

The phagemids from positive lambda isolates were excised from the rest of the lambda DNA and transformed into SOLR cells using the ExAssist helper phage, following the manufacturer's protocol (Stratagene, now Agilent Technologies, Santa Clara, Calif. 95051). The cells harboring the excised phagemids were plated, and single, well-isolated colonies were picked for characterization. Five to ten individual colonies were picked from each plated excision. Each colony pick was amplified in broth and the plasmids extracted using the Genecatch Plus Plasmid Miniprep Kit (Epoch Life Science, Sugar Land, Tex. 77496). The plasmids were digested with KpnI and with EcoRI/KpnI (Promega Corporation, Madison, Wis., 53711), and the bands separated by gel electrophoresis to confirm that all of the inserts in the phagemid were the same for each particular isolate. The size of the insert was also estimated from the gel at the same time.

Once it was certain that an isolate was a pure culture, a second activity confirmation was carried out. The cells were grown, and induced with IPTG (isopropylthiogalactoside) following the manufacturer's protocol (Stratagene, now Agilent Technologies, Santa Clara, Calif. 95051). The cells were lysed by sonication and the cell lysate was used in an activity assay against 4MU-LBG.

Ultimately, a subclone (clone 17-2) of a single plaque from the library screen was isolated that had an IPTG inducible activity against 4MU-LBG and an insert size of approximately 800 bp. The sequence of the positive insert was determined by sequencing the phagemid with standard T3/T7 primers. The sequences from the forward and reverse primers were read using Chromas (Technelysium Pty Ltd, South Brisbane QLD 4101, Australia) or Geneious 5.0 (Biomatters Inc. San Francisco, Calif. 94107) software, and a consensus sequence was generated. The ˜800 bp insert contained an open reading frame of 582 bp, corresponding to a 193 amino acid peptide with a calculated pI of 5.97. The nucleotide sequence (SEQ ID NO:3) and translated amino acid sequence (SEQ ID NO:4) of the open reading frame are both shown in FIG. 5. Since there was no identifiable ribosome binding site or −10 sequence, it was concluded that the cDNA encoded an active fragment of the complete polypeptide. As shown in FIG. 6, the nucleotide sequence of the gene fragment from the phagemid (SEQ ID NO:3) showed 75% identity with the nucleotide sequence of a glycogen debranching enzyme from Burkholderia glumae BGR1 (SEQ ID NO:5).
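The relationship between the 582 bp open reading frame and the 193 amino acid peptide can be checked with simple codon arithmetic. The sketch below assumes that the 582 bp figure includes the terminal stop codon, which the text does not state; the function name is hypothetical.

```python
def orf_to_residues(orf_bp, includes_stop=True):
    """Number of amino acids encoded by an open reading frame of `orf_bp` bases.

    Assumes the ORF length is a whole number of codons and, by default,
    that the final codon is a stop codon that encodes no residue.
    """
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


print(orf_to_residues(582))
```

Under that assumption, 582 bases give 194 codons, one of which is the stop codon, which is consistent with the 193 residues reported.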

Genomic DNA was extracted from the B603 strain using the cetyltrimethylammonium bromide (CTAB) method with multiple phenol extractions. See Ausubel, F. et al., Short Protocols in Molecular Biology, Wiley and Sons, NY (1995), which is hereby incorporated by reference in its entirety. The genomic DNA was digested with a panel of restriction endonucleases and the digests were electrophoresed on agarose gels. Biotin-labeled probe to clone 17-2 was prepared by PCR using a kit from Jena Bioscience GmbH (D-07749 Jena, Germany) according to manufacturer's directions. The probe was prepared using primers G3-1 and G3-2 (see Table V), resulting in a 558 bp biotinylated oligonucleotide, which was separated from unincorporated nucleotides using a PCR purification kit (Promega Corporation, Madison, Wis., 53711), again according to manufacturer's instructions. A Southern blot analysis using the biotinylated probe indicated that the gene corresponding to clone 17-2 is single copy and is contained in an EcoRI fragment of approximately 6-7 kb (FIGS. 7A and 7B). See Ausubel, F. et al., 1995, supra.

“Walking” and genomic cloning strategies were used to obtain the complete gene sequence (see SEQ ID NO:1), along with the 5′ and 3′ untranslated regions of the cDNA. Primers were designed to regions just upstream of the translation start site and downstream of the stop signal. The OligoCalc oligonucleotide properties calculator was used to determine primer fitness. See Kibbe, W. A., OligoCalc: an online oligonucleotide properties calculator, Nucl. Acids Res., 35(2):W43-W46 (2007), which is hereby incorporated by reference in its entirety. The primers that were used to determine the sequence are listed in Table V.

TABLE V
Primers for Sequencing the MLE Gene and Its Surrounding Gene Regions

Name           Specificity                           Sequence (5′ to 3′)                SEQ ID NO.
G3-1           upstream clone                        AGCTGCGATCGCCACGAGGGTGAAGCGCGCCAT  8
G3-2           downstream clone                      GTGCGTTTAAACTGCCGGTTCGGTCCGGACAAT  9
G3-3           161 internal                          GGAGCTGACCGACTTCGTGGCGCGGCTGG      10
G3-4           161 reverse                           CCAGCCGCGCCACGAAGTCGGTGAGCTCC      11
G3-5           281 internal                          AGGTGGCATGGTTCGACGAGAGTGG          12
G3-6           281 reverse                           CCACTCTCGTCGAACCATGCCACCT          13
G7             G7 upstream reverse                   CGTTGGCGTCGTTGTGTTTGTCGTTGT        14
G8             G8 upstream reverse                   CTTCGCCGTTGGCGTCGTTGTGTTTGTCG      15
G9 Biotin      G9 biotin internal Southern probe     GCGACGCCCGAGACCCATGTGTTC           16
Sac1.7P1F      primer for walk                       GGGCAATGTCGAGATCG                  17
Sac1.7P1R      primer for walk                       TTCTCCACCGGCAGGG                   18
F − 519        upstream forward primer               GATCACCAGCGGCGAAAGCCCT             19
R − 519        upstream reverse complement of −519   AGGGCTTTCGCCGCTGGTGATC             20
F + 718        downstream forward primer             GATCGCGCAGTTTCCCGGTGAG             21
R + 718        downstream reverse complement of 718  CTCACCGGGAAACTGCGCGATC             22
F + 1205       downstream forward primer             CGACGACTTCCACAATGCGCTGCAC          23
R + 1205       downstream reverse primer             GTGCAGCGCATTGTGGAAGTCGTCG          24
Fminus474      upstream forward primer               GCGCACGACGGCTTCACGCTG              25
Rminus474      upstream reverse primer               CAGCGTGAAGCCGTCGTGCGC              26
cc148R         upstream reverse primer               GCTCGGGCGCGAAGAAGGCAAGCGTG         27
HLE 2A1bF671   2A1 from F993 middle of seq           CTGCGAGGCAAGGATAACGAAGAGC          28
4A5bF600       4A4 from R849 middle of seq forward   CAAGCCATGCACGCCGGGATACCG           29
hle2up554      middle of cc148R seq upstream HLE     CATGTTCATAGCCGACTGACGAGGAAATC      30
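Several Table V entries are described as the reverse complement of a paired forward primer. A check of that relationship can be sketched as follows, using sequences copied from Table V (SEQ ID NOs 19/20, 21/22 and 12/13). The helper name and the choice of pairs are illustrative and are not part of the described method.

```python
# Map each base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


# (forward, reverse) primer pairs copied from Table V.
PAIRS = {
    "F - 519 / R - 519": ("GATCACCAGCGGCGAAAGCCCT", "AGGGCTTTCGCCGCTGGTGATC"),
    "F + 718 / R + 718": ("GATCGCGCAGTTTCCCGGTGAG", "CTCACCGGGAAACTGCGCGATC"),
    "G3-5 / G3-6": ("AGGTGGCATGGTTCGACGAGAGTGG", "CCACTCTCGTCGAACCATGCCACCT"),
}

for name, (forward, reverse) in PAIRS.items():
    matches = reverse_complement(forward) == reverse
    print(f"{name}: {'reverse complements' if matches else 'NOT reverse complements'}")
```

The same helper can be applied to any other pair in the table to confirm that a transcribed reverse primer anneals to the strand opposite its forward partner.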

As described above, genomic DNA has been shown by Southern blotting to contain an EcoR1 fragment of 6-7 kB that contains the MLE gene (FIG. 7B). For genomic cloning, a large EcoR1 digest of genomic DNA was electrophoresed on a preparative gel and the band region between 6 and 7 kB was excised. The DNA was purified using a Gene Jet Gel extraction kit (Thermo Fisher Scientific, Inc., Waltham, Mass. 02451) and the purified DNA was ligated into pUC19. The plasmids were transformed into an E. coli host, and the transformants were diluted and spread on 150 mm petri dishes. Once the colonies were grown, they were overlaid with sterile Hi-Bond N membranes (Amersham). The adherent cells were lysed in situ using 0.5N NaOH, neutralized with 1M Tris-HCl, pH 7.5 and washed in 0.5 M Tris-HCl, pH 7.5, 1.25M NaCl. See Ausubel, F. et al., 1995, supra. DNA on the membranes was cross-linked to the membranes with short-wave UV light. The blots were processed as for Southern blots using a probe prepared by PCR from genomic DNA using primers G3-1 and G3-2 (see Table V) and a kit from Jena Bioscience GmbH (D-07749 Jena, Germany) according to manufacturer's directions. The PCR resulted in a 558 bp biotinylated oligonucleotide that was separated from unincorporated nucleotides using a PCR purification kit (Promega Corporation, Madison, Wis., 53711), again according to manufacturer's instructions. Areas of the blots reacting with biotinylated probe were detected with streptavidin conjugated to alkaline phosphatase and visualized with nitroblue tetrazolium and 5-bromo-4-chloro-3-indolyl phosphate. See Ausubel, F. et al., 1995, supra. Colonies corresponding to purple spots were excised and replated until 100% of the colonies on the plate were positive when a membrane overlay of the plate was reacted with the probe. Then a single well-separated colony was chosen as a stock. The presence of the insert was confirmed by PCR with relevant primers. Several independent genomic clones containing an insert were isolated and sequenced.

Once plasmids containing the desired gene region were purified, the upstream and downstream regions were sequenced by “walking.” New primers were made based on the known gene sequences to sequence upstream and downstream from the known region, as discussed above (see Table V). When the new regions were sequenced, new forward and reverse primers were designed to amplify more upstream or downstream gene region, as well as read back to the previously known region to confirm the sequence. The upstream and downstream gene regions of the MLE gene are shown in FIGS. 10A and 10C and designated as SEQ ID NOS:6 and 7, respectively.

DNA sequences determined from clones and from walking experiments were proofread using the on-line Geneious sequence analysis software (Biomatters Incorporated, San Francisco, Calif. 94107) supplemented with manual inspection and sequence reconciliation.

The gene encoding MLE, designated herein as SEQ ID NO:1 (see FIG. 8B), was determined to have an EcoR1 site upstream of the original fragment isolated from the bacteriophage lambda library, and consequently the genomic fragment isolated from the EcoR1 digest did not contain the entire MLE coding region. Therefore, a second round of genomic cloning was carried out as above using a Pst1 digest of genomic DNA instead of an EcoR1 digest.

Although the original cDNA isolated from the lambda library appeared to contain the 3′ end of the MLE gene, the region downstream was also sequenced to confirm this hypothesis. 4400 bases of sequence upstream of the known fragment were determined. The genes and promoter and start sites were predicted from this sequence using an on-line Softberry analysis (Softberry, Inc. Mount Kisco, N.Y. 10549).

DNA sequences from all stages of the genomic sequencing project were assembled into a consensus sequence using the on-line Geneious software (Biomatters Inc., San Francisco, Calif. 94107) and BLAST (Basic Local Alignment Search Tool) online software accessible from the NCBI website, and the predicted sequence was analyzed with the ExPASy translation tool from SIB, the Swiss Institute of Bioinformatics. See Artimo P. et al., ExPASy: SIB bioinformatics resource portal, Nucl. Acids Res., 40(W1):W597-W603, 2012, which is hereby incorporated by reference in its entirety. The genomic cloning project yielded a DNA sequence with a single open reading frame containing the cDNA sequence. The open reading frame predicted a polypeptide of 702 amino acids, as shown in SEQ ID NO:2 (see FIG. 9). The predicted complete polypeptide showed a 65% identity to its best match against the National Library of Medicine database using BLAST. That match is a hypothetical protein from Herbaspirillum massiliense that appears to belong to the glycogen debranching family.

The DNA sequence of the originally isolated cDNA clone (SEQ ID NO:3) contains a single base change from the same region of the genomic DNA (SEQ ID NO:1). The genomic sequence (SEQ ID NO:1) had an adenine (A) at position 2047, giving a codon of ATT, which encodes isoleucine (I). In the equivalent position, the original non-genomic sequence (SEQ ID NO:3; catalytic fragment) had a thymine (T), yielding a codon of TTT and encoding a phenylalanine (F). In short, the genomic DNA encoded an isoleucine at amino acid position 683 (SEQ ID NO:2), but the original cDNA clone expressed a phenylalanine at the same position (see SEQ ID NO:4). Both the isoleucine-containing polypeptide encoded by the genomic clone (SEQ ID NO:2) and the phenylalanine-containing peptide encoded by the cDNA clone (SEQ ID NO:4) have activity against the substrate.
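The relationship between nucleotide position 2047 and amino acid position 683 can be checked with a short calculation. The sketch below assumes that nucleotide numbering in SEQ ID NO:1 starts at the first base of the start codon, an assumption that is consistent with the positions given above but is not stated explicitly; the function names are illustrative, and only the two codons discussed in the text are included in the lookup table.

```python
# Sketch: map a 1-based nucleotide position to its codon, assuming the open
# reading frame begins at nucleotide 1 (an assumption, see lead-in).
CODON_TABLE = {"ATT": "I", "TTT": "F"}  # only the two codons discussed


def codon_number(nt_position: int) -> int:
    """1-based codon (amino acid) number containing a 1-based nucleotide."""
    return (nt_position - 1) // 3 + 1


def position_in_codon(nt_position: int) -> int:
    """1, 2 or 3: where the nucleotide sits within its codon."""
    return (nt_position - 1) % 3 + 1


def substitute_first_base(codon: str, new_base: str) -> str:
    """Replace the first base of a codon."""
    return new_base + codon[1:]


if __name__ == "__main__":
    pos = 2047
    print("codon number:", codon_number(pos))
    print("position within codon:", position_in_codon(pos))
    genomic = "ATT"
    cdna = substitute_first_base(genomic, "T")
    print(genomic, "->", CODON_TABLE[genomic], ";", cdna, "->", CODON_TABLE[cdna])
```

Under that numbering assumption, position 2047 is the first base of codon 683, so an A to T change converts ATT (isoleucine) to TTT (phenylalanine), in agreement with the text.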

The restriction map for the complete sequence was determined using RestrictionMapper, an on-line restriction mapping program developed by Peter Blaiklock. The information from all of these sources was used to select restriction enzymes to excise the gene from the genome to create a complete cDNA clone (as shown in SEQ ID NO:1), as well as to generate appropriate cloning primers with restriction enzyme recognition sites for cloning the gene into various plasmid expression vectors.

Clones of the complete genomic MLE DNA (SEQ ID NO:1) and the C-terminal MLE cDNA (SEQ ID NO:3) expressing amino acids 509 through the C-terminus at amino acid 702 (SEQ ID NO:4) and possessing catalytic activity have been constructed (see Table VI).

TABLE VI
Constructs expressing complete and C-terminal MLE

Insert | Vector (and source) | Host(s) | Expression inducer | Expression tag | Activity
Complete genomic MLE | pHIS525 (MoBiTec GmbH)* | B. megaterium, B. subtilis | xylose | C-terminal 6X his | None
Catalytic (cDNA) fragment | pHIS525 (MoBiTec GmbH)* | B. megaterium, B. subtilis | xylose | C-terminal 6X his | None
Complete genomic MLE | pAES40 (Athena Environ. Sci.)** | E. coli | IPTG/lactose | C-terminal 6X his | Good
Catalytic (cDNA) fragment | pAES40 (Athena Environ. Sci.)** | E. coli | IPTG/lactose | C-terminal 6X his | Poor
Complete genomic MLE | pHT43 (MoBiTec GmbH)* | B. subtilis | IPTG | None | Poor
Catalytic (cDNA) fragment | pHT43 (MoBiTec GmbH)* | B. subtilis | IPTG | None | Good
Complete genomic MLE | Bluescript SK⁻ (Stratagene) | E. coli | IPTG | None | Poor
Catalytic (cDNA) fragment | Bluescript SK⁻ (Stratagene) | E. coli | IPTG | None | Fair
Complete genomic MLE | pFN6A (Promega) | E. coli | IPTG | N-terminal Halo-Tag ® (MKHQHQHQAIA) | Poor
Catalytic (cDNA) fragment | pFN6A (Promega) | E. coli | IPTG | N-terminal Halo-Tag ® (MKHQHQHQAIA) | Poor

*MoBiTec GmbH, 37083 Göttingen Germany
**Athena Environmental Science, Baltimore, MD 21227

Effect of rMLE on Native Substrates

Small Scale Treatment of the Pulp with rMLE Active Fragment. Lignin Removal.

A small sample of kraft softwood pulp was combined with a cell-free supernatant from E. coli expressing a pBluescript SK⁻ fusion protein of the active site of HLE fused to the α-peptide of β-galactosidase. Following incubation, the pulp was pelleted by centrifugation, the supernatant was sterile-filtered to remove any particulates, and the supernatant was lyophilized and redissolved in 0.1 volumes of distilled water. As shown in FIG. 10, colored material was removed from the pulp. Standard TAPPI protocols for quantitative measurement of lignin release call for measurement at OD₂₈₀ for soluble lignin (See Dence, C. W., The determination of lignin, In: Methods in Lignin Chemistry, S. Y. Lin and C. W. Dence (eds), pp. 33-61, Springer-Verlag, Berlin Heidelberg (1992), which is hereby incorporated by reference in its entirety) or at OD₂₀₅ for the release of acid-soluble lignin (See TAPPI UM 250, Acid-soluble lignin in wood and pulp, In: Technical Association of the Pulp and Paper Industry Useful Methods, 1991 TAPPI, Atlanta, Ga. 1991, pp. 47-48, which is hereby incorporated by reference in its entirety) because of possible interference by furfurals formed during acid treatment. OD₂₀₀ has been found to be more effective for softwood lignins. See Maekawa, E., An evaluation of the acid-soluble lignin determination in analysis of lignin by the sulfuric acid method, J. Wood Chem. Technol. 9(4):549-569 (1989), which is hereby incorporated by reference in its entirety. However, cell culture supernatants already have significant amounts of material absorbing at UV wavelengths that interfere with lignin measurement. An alternative measurement was developed.

Culture supernatant containing HLE activity was combined with kraft-cooked softwood pulp. A sample was withdrawn immediately or after 24 hrs of incubation at 30° C. Pulp was removed by centrifugation followed by filtration, and optical density was measured at 205 nm, 280 nm and 405 nm wavelengths. To read OD at 205 nm, the supernatant was diluted 500 fold. To read OD at 280 nm, the supernatant was diluted 8 fold. To read OD at 405 nm, the supernatant was not diluted. In culture supernatants, aromatic and other organic compounds formed and released during growth of E. coli may be interfering with the measurement of molecules released from pulp. However, measurement at 405 nm to quantitate the yellow-orange color released was effective. The color may be due to the conversion of lignin subunits to quinones. See Agarwal U. P., Assignment of the photoyellowing-related 1675 cm⁻¹ Raman/IR band to p-quinones and its implications to the mechanism of color reversion in mechanical pulps, Journal of Wood Chem. and Technol. 18(4):381-402 (1998) and Spender, J., Photostabilization of High-Yield Pulps Reaction of Thiols and Quinones with Pulp, a master's thesis for the Department of Chemistry, University of Maine (2001), both of which are hereby incorporated by reference in their entirety.

TABLE VII
Solubilization of Lignin Measured at Different Wavelengths

Sample | A₂₀₅ | A₂₈₀ | A₄₀₅
M9 medium | 0 | 0 | 0
Culture supernatant mixed with pulp, time 0 | −0.069 | −0.297 | 0.159
Culture supernatant mixed with pulp, time 24 hrs | −0.272 | −0.483 | 0.363

Pilot Scale Treatment of Pulp with rMLE Active Fragment. Effect on Pulp Properties.

Clone 17-2 in phage λ was excised in vivo using a helper phage as recommended by the manufacturer (Stratagene, now Agilent Technologies, Santa Clara, Calif. 95051) to yield the active fragment of MLE, amino acids 509-702 (SEQ ID NO:4) of the complete polypeptide (SEQ ID NO:2) fused in frame to the α-peptide of β-galactosidase. The construct in E. coli BL21 was grown overnight in 5 or 7 liter New Brunswick Scientific bioreactors (Enfield, Conn. 06082) in M9 medium containing 0.4% glycerol as the sole carbon source and 2 ml per liter of Trace Minerals (ATCC), and expression of the fusion protein was induced with 1 mM IPTG. The fusion protein was not exported actively into the medium, resulting in a very low concentration of enzyme in the culture supernatant.

Approximately 800 g of acid washed and oven-dried softwood kraft pulp at 15.37% consistency was washed 3× with distilled water, about 20 liters each wash. Between washes, the pulp was wrung dry in small batches in fine mesh bags. The pulp was then washed three times in MM9 (modified M9 medium containing 0.4% glycerol as the carbon source and 2 ml per liter of Trace Mineral Mix from ATCC). Between washes, the pulp was wrung dry as above. A small sample of the pulp was oven-dried to determine that the final consistency was 25.4%. 850 g of the pulp (corresponding to about 216 g of oven-dry pulp) was placed in each of 3 buckets. Buckets 1 and 2 were combined with 5 liters of fresh MM9. The third bucket was combined with 5 liters of culture supernatant. The pH of each bucket was adjusted to pH 5.0-5.5 with 1M citric acid. Each pulp mixture was heat-sealed into a plastic bag and incubated at 30° C. on a rotating platform for 24-36 hours. This process was repeated 2 times.
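The oven-dry equivalent quoted above follows directly from the measured consistency. As a minimal sketch, with a hypothetical helper name and consistency taken to mean oven-dry mass as a percentage of wet mass:

```python
# Sketch: oven-dry pulp mass from wet mass and consistency (percent solids).
def oven_dry_mass(wet_mass_g: float, consistency_percent: float) -> float:
    """Oven-dry mass in grams for a wet pulp sample of known consistency."""
    return wet_mass_g * consistency_percent / 100.0


if __name__ == "__main__":
    # 850 g of wrung pulp at 25.4% consistency, as loaded into each bucket.
    print(round(oven_dry_mass(850, 25.4), 1), "g oven-dry pulp per bucket")
```

For 850 g at 25.4% consistency this gives about 216 g of oven-dry pulp per bucket, matching the figure in the text.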

Between the incubations, the pulps were squeezed dry and usually added immediately to fresh MM9 or fresh culture supernatant. In some cases it was necessary to store the pulps for 1-2 days between incubations. In those cases, the pulps were individually washed several times with water, squeezed dry, and stored at 4° C. Before use, the individual pulps were again washed in fresh MM9 several times, squeezed dry, and then fresh MM9 or culture supernatant was added.

Following the incubations in MM9 and culture supernatants, the pulps were washed ×3 in water, squeezed dry, and resuspended in a final volume of 5 liters of 20 mM citrate buffer, pH 3.5. To Pulp 2 was added 5 MU of commercial isoamylase (Sigma-Aldrich Corporation, St. Louis, Mo. 63178) to give a final concentration of 1 KU/ml. The pulps were incubated at 45° C. Pulps 1 and 3 were stopped after 2 hours. Pulp 2 was incubated overnight at 45° C. After the incubations were stopped, each pulp was washed extensively in distilled water.
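The isoamylase dose can be checked by simple unit conversion. The sketch below uses a hypothetical helper name and reads MU as 10⁶ units and KU as 10³ units:

```python
# Sketch: enzyme concentration from total units and reaction volume.
def units_per_ml(total_units: float, volume_liters: float) -> float:
    """Enzyme concentration in units per milliliter."""
    return total_units / (volume_liters * 1000.0)


if __name__ == "__main__":
    # 5 MU (5e6 units) of isoamylase in a final volume of 5 liters.
    concentration = units_per_ml(5e6, 5)
    print(concentration / 1000.0, "KU/ml")
```

Five million units in 5 liters corresponds to 1,000 units per ml, that is, the 1 KU/ml stated above.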

The pulps were then delignified in a pilot scale oxygen delignifier at the Process Development Center (PDC) at the University of Maine. PDC measured a variety of pulp parameters in the starting pulp and the three experimental pulps. The results are summarized in Table VIII.

TABLE VIII
Effect of MLE pretreatment on Oxygen Delignified Pulp

Change relative to buffer treated controls | Pulp treatment: MLE (Probability of significance compared to control by t-test) | Pulp treatment: Isoamylase (Probability of significance compared to control by t-test)
Kappa number | −0.3 (87%) | +0.1 (<50%)
Brightness | −0.3 (81%) | −0.3 (81%)
Intrinsic viscosity | −10 (94%) | −2 (<50%)

Effect of rMLE on Hardwood Biomass Substrate

Samples of hardwood biomass substrates pretreated with two different proprietary regimes were generously supplied by Mascoma Corporation (Waltham, Mass. 02451). The pulps were washed three times with 10 mM sodium citrate, pH 5.5 in HBSS by centrifugation to lower the initial pH. Aliquots of 0.5 g washed pulp were incubated with either 0.5 ml of cell-free culture supernatant or 0.5 ml of culture supernatant from the corresponding untransformed host strain. Each incubation was carried out at 29° C. with gentle shaking over the course of 72 hours. The samples were returned to Mascoma Corporation (Waltham, Mass. 02451) for saccharification to monosaccharides and subsequent quantification by HPLC analysis. For Substrate 1, the MLE catalytic fragment (SEQ ID NO:4) increased glucose concentration by about 20%, but the full length rMLE had little or no effect. For Substrate 2, the reverse held true. The catalytic fragment had no effect, and the full length MLE (SEQ ID NO:2) increased glucose concentration by about 5%. Treatment of Substrate 1 with the catalytic fragment also showed an increase in xylose recovery. However, treatment of the pulps with the complete recombinant MLE had little effect on xylose recovery.

TABLE IX
Effect of Recombinant MLE on Hardwood Pulp Sugar Recovery: % increase over relevant control.

Pulp | Treatment: Catalytic fragment¹ | Treatment: Complete genomic MLE²
Substrate 1, Glucose content, g/L | 20.4% | −0.2%
Substrate 1, Xylose content, g/L | 21.7% | 4.3%
Substrate 2, Glucose content | 0.1% | 5.1%
Substrate 2, Xylose content | 0.0% | 0.0%

¹Culture supernatant from a B. subtilis host transformed with pHT43 to express the catalytic fragment of MLE fused to an export sequence, grown in M9 minimal medium and induced with IPTG.
²Culture supernatant from an E. coli host transformed with pAES40 to express the complete MLE fused to an export sequence, grown in M9 minimal medium and induced with IPTG/lactose.

Example 6 Cloning of xle

XLE activity was identified using zymography, an electrophoretic technique that reveals protein bands based on their enzymatic activity. Non-denaturing polyacrylamide gels were based on the original Laemmli formulation of polyacrylamide gels (see Laemmli, “An efficient polyacrylamide gel electrophoresis system for proteins separation.” Nature 227: 690-695 (1970), which is incorporated herein by reference in its entirety), but sodium dodecyl sulfate and β-mercaptoethanol were eliminated from the running gel, stacking gel and sample buffer. E518 cells were grown in shake flasks at 34° C. in HBSS containing 2 ml per L of trace element mix from American Type Culture Collection (Manassas, Va. 20110) containing either 0.4% oligoxylose (Cascade Analytical Reagents and Biochemicals, Corvallis, Oreg.) or a mix of 0.35% oligoxylose and 0.05% benzylated xylan. Samples were taken from each culture after 29 hours of growth at 34° C. Each sample was desalted vs HBSS without ammonium nitrate and concentrated on a spin column with a molecular weight cut-off of 2 KD preconditioned according to manufacturer's protocol (Sartorius Stedim North America Inc. Bohemia, N.Y. 11716) or 3 KD (Amicon, EMD Millipore, Billerica Mass. 01821) and lyophilized. After lyophilization, each sample was dissolved in about 1/150 of its original volume.

Each sample was run in duplicate on the same gel which was then cut in half. One half of the gel was stained with Coomassie Brilliant Blue G-250 (Thermo Fisher Scientific, Waltham, Mass. USA 02451) and the other half was zymographed. Zymography was performed after exchanging the gel buffer by soaking the gel for 10 minutes in HBSS containing 1% Triton X-100. The gel was then overlaid with 0.35% agarose in HBSS containing 4MU-xylan substrate, and the overlay was allowed to remain on the gel for 8 minutes. The agarose was then removed and a wetted PVDF membrane was immediately overlaid onto the gel for 5 minutes. The membrane was carefully removed. The membrane and the gel piece were then rinsed with 0.1 M sodium borate buffer at pH 9.9 and photographed (FIG. 11). Pieces of Coomassie stained gels corresponding to the regions of the strongest zymographic activity were sent for protein microsequencing to the Protein and Nucleic Acid Analysis Core Facility at the Maine Medical Center Research Institute (Scarborough, Me. 04074).

The microsequencing process revealed a number of peptides whose likely polypeptides of origin were determined by comparison to a Uniprot Paenibacillus database. See The UniProt Consortium, Activities at the Universal Protein Resource (UniProt), Nucleic Acids Res. 42: D191-D198, 2014, which is incorporated herein by reference in its entirety. Six peptides were investigated that had the best chance of being from an XLE polypeptide. These were peptides identified as originating from polypeptides belonging to 1) bacterial flagellin, 2) hydrolase, 3) licheninase, 4) esterase, 5) toxic anion resistance protein and 6) unknown families of proteins.

The DNA sequences corresponding to each of the candidate peptides from Paenibacillus species were determined from the UniProtKB and NIH genomic databases. PCR primer sets were designed for each of the peptides using the OligoCalc oligonucleotide properties calculator, as described above. The PCR primers were tested against genomic DNA from E518 to confirm that they would indeed generate a DNA fragment of the expected size. If not, the gene family members from Paenibacillus species in the UniProtKB and NIH genome databases were aligned and examined for highly conserved regions, and those regions were used for primer redesign. After any primer redesign, the resulting primers were tested against genomic DNA from E518. This process also served to choose PCR conditions. In addition, primer sets were tested against E. coli DNA, to ensure that there were no host reactions to complicate the PCR assay. In some cases, the primer pairs were used to amplify genomic DNA from E518 for DNA sequencing. This sequence was then used to design unambiguous primers.

TABLE X
Screening Primers for Genes That May Encode XLE

Primer name | Sequence | SEQ ID NO.
R9LCP9FM3 | ATGGGGGAACAAYGAACTKCAGTAYYATA | 31
R9LCP9MRCALAAMCKTTGGTTRTTGGCGRMRTAG | 32
R9seqF1 | CAGGTGACGGGTGGAAATCTGG | 33
R9seqR1 | CTGCTGAATCTTCGCTCCGCTG | 34
GOVVFM3 | GGATGGGGAAACAATGARCTGCAGTAYTAT | 35
GOVVRM3 | CCARTTYCCGCCRACCGCNARRTTCAG | 36
C6D588F3 | GAGCTGGCTACACAATCCGCGAACGGT | 37
C6D588R6 | CAGAACGCCTTGCGGTTGTTGATTAGCTTG | 38
S3AYW2F2 | GGAGTCCTGGAGCGTGTGACGATGC | 39
S3AYW2R2 | AAATGCCGAAAGGCGCTCGCAAAGCT | 40
E0IAJ1F1 | CTCACCCGAAGAACGCCAGCTGATGAAC | 41
E0IAJ1R2 | GTGTACGTAATGCGGGGGAACGAAC | 42
T2LU54F1 | TTGAGGTAGCCAGCCCGGAAGAGATCA | 43
T2LU54R2A | GTGTACGTAATGCGGGGGAACGAAC | 44

In order to generate a genomic library, the genome size of E518 was assumed to be similar to those of its closest relatives: Paenibacillus lautus Y412MC10, 7.1 Mb (See Mead et al., “Complete Genome Sequence of Paenibacillus strain Y4.12MC10, a Novel Paenibacillus lautus strain Isolated from Obsidian Hot Spring in Yellowstone National Park,” Standards in Genomic Science 6:3 (2012), which is incorporated herein by reference in its entirety), Paenibacillus polymyxa E681, 5.4 Mb (See Kim J F, et al., “Genome sequence of the polymyxin-producing plant-probiotic rhizobacterium Paenibacillus polymyxa E681,” J. Bacteriol. 192(22), 6103-6104 (2010), which is incorporated herein by reference in its entirety), Paenibacillus polymyxa SC2, 6.21 Mb (See Ma et al., “Complete genome sequence of Paenibacillus polymyxa SC2, a strain of plant growth-promoting Rhizobacterium with broad-spectrum antimicrobial activity,” J. Bacteriol. 193(1): 311-312 (2011), which is incorporated herein by reference in its entirety), and Paenibacillus terrae HPL-003, 6.1 Mb (See Shin et al., “Genome sequence of Paenibacillus terrae HPL-003, a xylanase-producing bacterium isolated from soil found in forest residue,” J. Bacteriol. 194(5):1266 (2012), which is incorporated herein by reference in its entirety). These values were all about 6±1 Mb.

The number of recombinants that must be screened for a gene of interest to be covered, with a given probability, in a random library of fragments of genomic DNA is

$N = \frac{\ln\left( {1 - P} \right)}{\ln\left( {1 - {a/b}} \right)}$

wherein N is the number of recombinants to be screened, P=the probability of including a particular sequence in a random genomic library, a=the mean size of the fragments, and b=the genome size, so that a/b is the fraction of the genome carried by a single insert (See Clarke et al., “A colony bank containing synthetic Col E1 hybrid plasmids representative of the entire E. coli genome,” Cell 9: 91-99 (1976), which is incorporated herein by reference in its entirety). Creating the library in cosmid vectors, where the average insert size is 40 kB, reduces the number of clones to be screened for a 95% chance of finding the gene to about 500, which is a reasonable number to screen by a PCR assay. If the library is created in a standard cloning vector with an average insert size of 2 kB, the number of recombinants to be screened would be close to 10,000, which may not be a reasonable number to screen by PCR.
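The library-size estimates quoted above can be reproduced numerically. The following is a minimal sketch that takes a/b to be the fraction of the genome carried by one insert (mean insert size divided by genome size) and assumes a genome of 6 Mb, the central value of the range discussed for the related Paenibacillus strains; the function name and the 5 to 7 Mb sweep are illustrative choices, not part of the original procedure.

```python
import math


def clones_needed(probability: float, insert_bp: float, genome_bp: float) -> int:
    """Number of recombinants N such that a given sequence is present in a
    random library with the stated probability: N = ln(1-P) / ln(1 - a/b)."""
    fraction = insert_bp / genome_bp  # a/b: share of the genome per insert
    return math.ceil(math.log(1.0 - probability) / math.log(1.0 - fraction))


if __name__ == "__main__":
    genome = 6_000_000  # assumed genome size in bp (about 6 Mb)
    for label, insert in (("cosmid, 40 kB", 40_000), ("plasmid, 2 kB", 2_000)):
        print(label, clones_needed(0.95, insert, genome))
    # Sensitivity to the assumed genome size (5 to 7 Mb) for cosmid inserts.
    for size in (5_000_000, 7_000_000):
        print(size, clones_needed(0.95, 40_000, size))
```

With these inputs the cosmid library needs on the order of 450 clones and the 2 kB plasmid library roughly 9,000, consistent with the "about 500" and "close to 10,000" figures in the text; across the 5 to 7 Mb range the cosmid estimate stays within a few hundred clones.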

Genomic DNA was prepared from E518 by lysing the cells using B-Per reagent (Thermo Fisher Scientific Inc., Rockford, Ill. USA 61101) following manufacturer's instructions, except that 2.5 μg/ml RNase A, 0.016 U/ml B. subtilis protease, 0.1 mg/ml lysozyme, and 60 μg/ml proteinase K were added instead of DNase I and any vigorous pipetting was avoided. The resulting supernatant containing genomic DNA was extracted twice with phenol:chloroform, and the DNA was precipitated from the aqueous layer with ethanol. The pellet was allowed to air-dry for at least 1 hour and resuspended gently in TE buffer to a final concentration of approximately 0.4 μg/μl.

The genomic DNA was sheared to an average size of approximately 40 kB by passing it through a 200 μl pipette tip about 50-70 times. The size range of the resulting fragments was tested by gel electrophoresis. Multiple preparations of genomic DNA were used for library preparation, and each preparation was tested individually for the appropriate number of passes through a pipette tip to generate fragments with an average size of 40 kB. Sheared DNA was separated by gel electrophoresis on low melting-point agarose. Each DNA sample was loaded into two lanes: a control lane for location and a lane with a higher concentration of DNA for isolation. Once the gel was run, the portions containing the control sample lanes along with a lane containing a cosmid size marker supplied with the pWEB kit were cut off and stained with ethidium bromide. The region corresponding to approximately 40 kB was cut out of the unstained portion of the gel and the DNA was isolated from the gel as recommended by the manufacturer of the pWEB kit (Epicentre Biotechnologies, Madison, Wis. 53719).

A cosmid library from E518 genomic DNA was constructed and plated using a pWEB cosmid kit (Epicentre Biotechnologies, Madison, Wis. 53719) following manufacturer's directions. Individual colonies were picked into individual wells of 96 deep well plates, each well containing 400 μl of E. coli growth medium, as well as to a gridded petri dish. When the cultures in the 96 well plate were grown up, 150 μl of each well's contents was removed and combined with similar aliquots from 7 other wells in a microfuge tube. The cells from the pooled cultures in the tubes were pelleted by centrifugation and washed twice with distilled water. 1 μl of a 1:10 dilution of the pooled cell pellets was used for each PCR assay. Each of the original wells of the deep well plates (now containing 250 μl of cell culture) was mixed with 50% glycerol to a final concentration of 25% and the plates were frozen and stored at −80° C. The gridded petri dishes were grown overnight and stored at 4° C. until needed.

When a pooled cell mixture showed a positive reaction by PCR, the eight individual colonies of the pool were picked from the colonies on the gridded plate that corresponded to the wells used to construct the pool. Each of those colonies was inoculated to an individual petri dish as well as to an individual LB broth culture. The cell pellets obtained from the individual broth cultures were tested by PCR, and the single positive isolate was further purified by amplifying its individual colonies and retesting by PCR. Freezer stock was made from several isolates of each positive.
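The two-stage pooled screen described above can be sketched as follows. The text does not say which eight wells were combined into each pool, so the example assumes, purely for illustration, that each pool is one column of a 96-well plate (wells A through H); the function names and the example numbers of clones and positive pools are likewise hypothetical.

```python
import math
from string import ascii_uppercase

ROWS = ascii_uppercase[:8]   # plate rows A-H
COLUMNS = range(1, 13)       # plate columns 1-12


def column_pools() -> dict:
    """Map each pool (one per plate column, an assumed layout) to its 8 wells."""
    return {col: [f"{row}{col}" for row in ROWS] for col in COLUMNS}


def pcr_reactions(n_clones: int, n_positive_pools: int, pool_size: int = 8) -> int:
    """Total PCR assays: one per pool, then one per clone of each positive pool."""
    first_round = math.ceil(n_clones / pool_size)
    second_round = n_positive_pools * pool_size
    return first_round + second_round


if __name__ == "__main__":
    pools = column_pools()
    print("pools per plate:", len(pools))
    print("wells in pool 1:", pools[1])
    # Hypothetical example: 480 clones (five plates) with 3 positive pools.
    print("PCR assays needed:", pcr_reactions(480, 3))
```

Pooling eight clones per reaction cuts the first-round workload eightfold; only the members of positive pools need to be assayed individually.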

Interestingly, clones positive for the licheninase primer pair were also positive for the hydrolase primer pair. The DNA sequence of the PCR product generated using the hydrolase primer pair was determined and shown to encode the amino acid sequence of the peptide previously identified as being from an enzyme belonging to the licheninase family. It seemed likely that both the licheninase and hydrolase peptides identified by microsequencing were part of the same polypeptide. The identical origin of both peptides was confirmed by a PCR analysis. When the forward primer for the licheninase gene was paired with the reverse primer for the hydrolase gene, a ~700 bp band was generated. It was evident that the two peptide fragments were in fact part of the same gene.

The PCR positive isolates were further tested by assaying for XLE activity. Broth cultures were grown in M9 complete medium and the cells pelleted by centrifugation. The cell pellets were washed twice with HBSS without ammonium nitrate and the drained pellets were weighed, treated with protease inhibitor cocktail (P8465, Sigma-Aldrich Chemicals, St. Louis, Mo. 63178) and frozen. To assay the pellets for XLE activity, the pellets were thawed on ice, treated with B-Per reagent (Thermo Fisher Scientific Inc., Rockford, Ill. USA 61101) containing 2.5 μg/ml RNase A, 0.1 mg/ml lysozyme, and 1 ml/g cells protease inhibitor cocktail, and incubated at room temperature for 10 minutes. 1.5 volumes of HBSS without ammonium nitrate was then added to the treated cells, and the lysed cells were heated to 65° C. for 10 minutes. The tubes were centrifuged, and the supernatant used in an assay for XLE activity as above. Culture supernatant from E518 served as a positive control and HBSS was the negative control. Of the five individual clones tested, two were more strongly positive than the E518 control, one was approximately as positive as the E518 control, and two were clearly negative.

Cosmid DNA was extracted from the PCR positive isolates in an attempt to sequence the gene. However, the sequence generated from the cosmids was not clean enough for a confident sequence. An alternative sequencing approach looked for those cosmids in which the xle gene was close enough to the cosmid insertion site that an xle primer paired with a cosmid primer could amplify previously unsequenced regions of the xle gene. All of the PCR positive isolates were tested. One isolate, 5-1F, generated an approximate 1 kb band when the T7 (cosmid vector) primer was paired with the R9seqR1 primer. The sequence generated from this PCR fragment was used to obtain the start site and some upstream region of the gene. When the cosmid M13 primer was paired with R9seqF1, two other isolates, 3-2G and 4-5F, yielded a 4 kb and a 2 kb fragment respectively. These isolates were used to determine the remaining downstream sequence of xle.

When the entire xle gene sequence was put together, it was discovered that there was no BamHI site present in the gene. Isolate 1-8C had XLE activity but did not show a PCR fragment from any vector primer combined with any xle primer despite having been positive with licheninase and hydrolase primers, indicating that the xle gene was likely both to be complete and to be located far from the insertion site. Consequently, a substantial portion of the upstream and downstream gene regions were likely to be present. 1-8C was digested with BamHI and the fragments ligated into the BamHI site of pUC19. Individual transformants were screened by PCR using primers R9seqF1 and R9seqR1 (see Table X). One positive, named pUCXLE44, was found with an insert of approximately 8 kb. pUC19XLE44 was grown overnight in LB broth culture, and the plasmid purified using the GenCatch Plasmid Mini-Prep Kit (Epoch Life Sciences Inc, Sugar Land, Tex. 77496). The insert DNA was sequenced using pUC19 primers and with the primers developed for PCR of xle from cosmids and E518. Additional primers required to complete the sequence were developed as sequence data became available and are listed in Table XI.

TABLE XI
Additional xle Sequencing Primers

Primer name | Sequence | SEQ ID NO.
XLEfor1 | GCAAAGTCATGGATGTGGTCGATG | 45
XLErev2 | TAATATCCGCCTCCGACATCCACGG | 46
INR9revF1 | CCAGATTTCCACCCGTCACCTG | 47
INR9revR1 | CAGCGGAGCGAAGATTCAGCAG | 48

DNA sequences determined from clones and from walking experiments were proofread using the Geneious sequence analysis software (Biomatters Incorporated, San Francisco, Calif. 94107) supplemented with manual inspection and sequence reconciliation. An on-line Softberry analysis (Softberry, Inc. Mount Kisco, N.Y. 10549) suggested that the xle gene is an independent transcriptional unit and not part of an operon.

The xle gene (SEQ ID NO:49, FIG. 12) encodes 412 amino acids (SEQ ID NO:50, FIG. 13), and is both preceded and succeeded by multiple stop codons. A BLAST search (see Altschul et al., “Gapped BLAST and PSI-BLAST: a new generation of protein database search programs,” Nucleic Acids Res. 25:3389-3402 (1997), which is hereby incorporated by reference in its entirety) using either the protein sequence or the DNA sequence revealed its closest relatives (approximately 80% identity in either case) to be members of the laminarinase-like subfamily of glycoside hydrolase family 16 with activity towards 1,3 β-glucans. The highest levels of identity were with genes from other members of the Paenibacillus genus and with a Bacillus circulans strain. In addition, there is high homology (~80% identity at the amino acid level) with a xylanase from Paenibacillus sp. JCM 10914.

There is a relationship between 1,3 β-glucanase and XLE. Xylose in xylan is in the pyranose conformation, and it has the same stereochemistry at carbons 1, 2, 3 and 4 as glucose in β-glucan. The major difference between the two sugars as residues in a polysaccharide chain is whether a C6 group is attached to C5.

In addition, the carboxy terminus of the sequence contains a sugar binding site of the ricin superfamily, composed of three repeats of a QXW motif (see Hazes, “The (QxW)3 domain: a flexible lectin scaffold,” Protein Sci. 5(8):1490-1501 (1996), which is hereby incorporated by reference in its entirety). In XLE, the QXW repeats consist of two QQW motifs and one QRW motif (underlined, as shown in FIG. 13).
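QXW repeats of the kind described above can be located in a protein sequence with a regular expression. The sketch below is illustrative only. It treats X as any single residue, uses a lookahead so that overlapping candidates are reported, and runs on a hypothetical toy sequence rather than on SEQ ID NO:50. A pattern match alone does not establish that a residue triplet belongs to a ricin-type domain.

```python
import re

# Q, then any one residue, then W. The lookahead allows overlapping matches.
QXW_PATTERN = re.compile(r"(?=(Q[A-Z]W))")


def find_qxw(protein):
    """Return (0-based position, motif) for every QXW triplet in `protein`."""
    return [(m.start(), m.group(1)) for m in QXW_PATTERN.finditer(protein.upper())]


# Hypothetical toy sequence containing two QQW triplets and one QRW triplet.
toy = "AAQQWGGGQRWTTTQQWAA"
for position, motif in find_qxw(toy):
    print(position, motif)
```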

Although preferred embodiments have been depicted and described in detail herein, it will be apparent to those skilled in the relevant art that various modifications, additions, substitutions, and the like can be made without departing from the spirit of the invention and these are therefore considered to be within the scope of the invention as defined in the claims which follow.

REFERENCES

-   Ragauskas, A. J., Williams, C. K., Davison, B. H., Britovsek, G., Cairney, J., Eckert, C. A., Frederick, W. J. Jr., Hallett, J. P., Leak, D. J., Liotta, C. L., Mielenz, J. R., Murphy, R., Templer, R. and Tschaplinski, T. 2006. The path forward for biofuels and biomaterials. Science 311:484-489.
-   ICIS Chemical Business, Feb. 11, 2008. Biofuels backlash grows in fuel versus food debate. Simon Robinson/London.
-   Werpy, T. and Peterson, G. 2004. Top value-added chemicals from biomass, volume I. Results of screening for potential candidates from sugars and synthesis gas. U.S. Department of Energy, National Renewable Energy Laboratory, Publication No. DOE/GO-102004-1992.
-   van Heiningen, A. 2006. Converting a kraft pulp mill into an integrated forest biorefinery. Pulp and Paper Canada 107:38-43.
-   Sun, X. F., Sun, R. C., Zhao, L., and Sun, J. X. 2004. Acetylation of sugarcane bagasse hemicelluloses under mild reaction conditions by using NBS as a catalyst. Journal of Applied Polymer Science 92:53-61.
-   Robinson, D. 1956. The fluorometric determination of β-glucosidase: its occurrence in the tissues of animals, including insects. Biochemical Journal 63:39.
-   Sluiter, A., Hames, B., Ruiz, R., Scarlata, C., Sluiter, J., Templeton, D., and Crocker, D. 2011. Determination of structural carbohydrates and lignin in biomass. Laboratory Analytical Procedure (LAP) (Version Jul. 8, 2011). Technical Report NREL/TP-510-42618.
-   Warnick, T. A., Methe, B. A. and Leschine, S. B. 2002. Clostridium phytofermentans sp. nov., a cellulolytic mesophile from forest soil. Int. J. Systemat. Evolut. Microbiol. 52:1155-1160.
-   Alexander, M. 1965. Biodegradation: problems of molecular recalcitrance and microbial fallibility. Adv. Appl. Microbiol. 7:35-80.
-   Chang, M. C. Y. 2007. Harnessing energy from plant biomass. Curr Op Chem Biol. 11:677-684.
-   Danneel, H.-J., Rossnerz, E., Zeeck, A. and Giffhor, F. 1993. Purification and characterization of a pyranose oxidase from the basidiomycete Peniophora gigantea and chemical analyses of its reaction products. Eur. J. Biochem. 214:795-802.
-   Connell, L., Redman, R., Craig, S., and Rodriguez, R. 2006. Distribution and abundance of fungi in the soils of Taylor Valley, Antarctica. Soil Biol. Biochem. 38:3083-3094.
-   Lu, Y. 2002. Benzyl konjac glucomannan. Polymer 43:3979-3986.
-   Stark, M., Berger, S. A., Stamatakis, A., von Mering, C. 2010. MLTreeMap - accurate Maximum Likelihood placement of environmental DNA sequences into taxonomic and functional reference phylogenies. BMC Genomics 11:461 doi:10.1186/1471-2164-11-461 http://www.biomedcentral.com/1471-2164/11/461.
-   Ausubel, F., Brent, R., Kingston, R. E., Moore, D. D., Seidman, J. G., Smith, J. A., and Struhl, K. A. 1995. Short Protocols in Molecular Biology. Wiley and Sons, NY.
-   Dence, C. W. 1992. The determination of lignin. In Methods in Lignin Chemistry, S. Y. Lin and C. W. Dence eds. Springer-Verlag, New York.
-   Technical Association of the Pulp and Paper Industry, Atlanta 1991. Official Test Method UM250. Acid-soluble lignin in wood and pulp. Useful Methods. pp. 47-48.
-   Maekawa, F., Ichizawa, T. and Koshijima, T. 1989. An evaluation of the acid-soluble lignin determination in analysis of lignin by the sulfuric acid method. J. Wood Chem. Technol. 9:549-569.
-   Agarwal, U. P. 1998. Assignment of the photo-yellowing-related 1675 cm⁻¹ Raman/IR band to p-quinones and its implications to the mechanism of color reversion in mechanical pulps. Journal of Wood Chemistry and Technology 18:381-402.
-   Spender, J. 2001. Photostabilization of high-yield pulps reaction of thiols and quinones with pulp, a master's thesis for the department of Chemistry, University of Maine.
-   Samuel, R., Foston, M., Jiang, N., Allison, L., and Ragauskas, A. J. 2011. Structural changes in switchgrass lignin and hemicelluloses during pretreatments by NMR analysis. Polym. Degrad. Stabil. 96(11):2002-2009.
-   Dayhoff (ed.). 1978. Atlas of Protein Sequence and Structure, Vol. 5, Suppl. 3, Natl. Biomed. Res. Found., Washington D.C., pp. 345-352.
-   Henikoff, S. and Henikoff, J. G. 1992. Amino acid substitution matrices from protein blocks. Proc. Natl. Acad. Sci. USA, 89:10915-10919.
-   Sambrook et al. 1989. Molecular Cloning, A Laboratory Manual, 2d edition, Cold Spring Harbor, N.Y.
-   Altschul, S. F., Madden, T. L., Schäffer, A. A., Zhang, J., Zhang, Z., Miller, W. and Lipman, D. J. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res., 25:3389-3402.
-   Nielsen, H., Engelbrecht, J., Brunak, S., and von Heijne, G. 1997. Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites. Protein Eng. 10:1-6.
-   Prongjit, M., Sucharitakul, J., Wongnate, T., Haltrich, D. and Chaiyen, P. 2009. Kinetic mechanism of pyranose 2-oxidase from Trametes multicolor. Biochem. 48(19):4170-4180.
-   Kibbe, W. A. 2007. OligoCalc: an online oligonucleotide properties calculator. Nucl. Acids Res., 35(2):W43-W46.
-   Janson, J.-C. and Ryden, L. (eds). 1989. Protein Purification: Principles, High Resolution Methods and Applications, VCH Publishers, Inc., New York.
-   Sievers, F., Wilm, A., Dineen, D., Gibson, T. J., Karplus, K., Li, W., Lopez, R., McWilliam, H., Remmert, M., Söding, J., Thompson, J. D., Higgins, D. G. 2011. Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol. Syst. Biol., 7(539):1-6.

What is claimed is:
 1. A cDNA encoding a polypeptide that specifically cleaves a non-glycosidic ether bond between a lignin or a derivative thereof and a saccharide wherein said polypeptide comprises an amino acid sequence having at least 85% sequence identity to the amino acid sequence of SEQ ID NO:2 or SEQ ID NO:4 or SEQ ID NO:50.
 2. The cDNA of claim 1, wherein cleavage of the non-glycosidic ether bond is between an aromatic carbon of the lignin or the derivative thereof and the saccharide.
 3. The cDNA of claim 1, wherein cleavage of the non-glycosidic ether bond is between a non-aromatic carbon of the lignin or the derivative thereof and the saccharide.
 4. The cDNA of claim 3, wherein said non-aromatic carbon of the lignin is an α-linked benzyl carbon or a β-linked benzyl carbon.
 5. The cDNA of claim 1, wherein said saccharide is selected from the group consisting of a monosaccharide, a disaccharide, an oligosaccharide, and a polysaccharide.
 6. The cDNA of claim 1, wherein said saccharide is a polysaccharide and said polysaccharide is hemicellulose.
 7. The cDNA of claim 1 wherein the polypeptide comprises an amino acid sequence having at least about 90-95% sequence identity to the amino acid sequence of SEQ ID NO:2 or SEQ ID NO:4 or SEQ ID NO:50.
 8. The cDNA of claim 1, wherein said cDNA comprises a nucleotide sequence of SEQ ID NO:1 or SEQ ID NO:3 or SEQ ID NO:49.
 9. A nucleic acid construct or an expression vector comprising the cDNA of claim 1 wherein said cDNA is operably linked to one or more control sequences that direct the expression of the polypeptide in an expression host.
 10. The nucleic acid construct or expression vector of claim 9, wherein said construct or vector is selected from the group consisting of pHIS525-cMLE, pHIS525-cfMLE, pAES40-cMLE, pAES40-cfMLE, pHT43-cMLE, pHT43-cfMLE, pBluescript SK-cMLE, pBluescript SK-cfMLE, pFN6A-cMLE and pFN6A-cfMLE.
 11. A transformed host cell comprising an expression vector that comprises the cDNA of claim 1 wherein said cDNA is operably linked to one or more control sequences that direct the production of the polypeptide.
 12. The transformed host cell according to claim 11, wherein said host cell is selected from the group consisting of B. megaterium (pHIS525-cMLE), B. subtilis (pHIS525-cMLE), B. megaterium (pHIS525-cfMLE), B. subtilis (pHIS525-cfMLE), E. coli (pAES40-cMLE), E. coli (pAES40-cfMLE), B. subtilis (pHT43-cMLE), B. subtilis (pHT43-cfMLE), E. coli (pBluescript SK⁻-cMLE), E. coli (pBluescript SK⁻-cfMLE), E. coli (pFN6A-cMLE) and E. coli (pFN6A-cfMLE).
 13. A method of producing a heterologous polypeptide that specifically cleaves a non-glycosidic ether bond between a lignin or a derivative thereof and a saccharide, comprising: (a) cultivating the transformed host cell of claim 12 under conditions conducive for production of the heterologous polypeptide; and (b) recovering the heterologous polypeptide.
 14. The cDNA of claim 1, wherein said polypeptide comprises the amino acid sequences of SEQ ID NO:50.
 15. The cDNA of claim 14, wherein said polypeptide comprises an amino acid sequence having at least about 85% sequence identity to the amino acid sequence of SEQ ID NO:50.
 16. The cDNA of claim 15 wherein said polypeptide comprises an amino acid sequence having at least about 90-95% sequence identity to the amino acid sequence of SEQ ID NO:50.
 17. The cDNA of claim 1, wherein said cDNA comprises a nucleotide sequence of SEQ ID NO:49.